A conversation with Kyle Shannon and his data science class
Alex Liebscher: I’m from San Jose, California, and I went to school at UCSD. I came in as a freshman in 2016 as an economics major. This was something that I really thought I wanted to do. I took one econ class and decided that I really didn’t like it. I started reaching out to grad students and professors and talking to my friends, trying to get a feel for what other people were doing and why they were doing it. I think the most influential conversation I had was with a grad student from the cognitive science department. He opened up my eyes to cognitive science and what that was, and how psychology could interact with data. That was a really pivotal moment for me.
So I changed my major to cognitive science and I continued on that path, still not really knowing what I wanted to do. I liked neuroscience, and the idea of intelligence, and mixing in philosophy and stuff really struck me. And so I dug a little further in on this. The summer after my freshman year I worked as a landscaper up near Tahoe. I only say this, and I’ll get back to it later, to make a point that you don’t always have to have a data science internship every single summer of your college career.
Going into my sophomore year, I got a lab position with a psychology lab on campus. We were looking at neural signals in song birds. I thought this taste of research was intellectually stimulating, but also boring at the same time. In general, I liked the people that I was working with, but I didn’t really connect that well with anyone. I just floated through this research position for my sophomore year, not knowing where I was heading. Only at the end of that year did I finally get involved in a project and start to exercise some of the skills that I was learning in my classes. It was at this point that I realized how interesting research could be, and how you could mix math with data. I pushed hard to find an internship somewhere in the data field for the summer after my sophomore year. I ended up reaching out to someone in an HR department at a company called TextRecruit. They were about a 60 person startup in downtown San Jose. I sent someone at their HR a message on LinkedIn and introduced myself. I said I was looking for a data science type position, and one thing led to another. At TextRecruit, I did some data analysis on a big data set of basically text messages. They had me come in and figure out, basically, is their chatbot product working, what is it doing, and where can we improve things. This was a good experience for me because they didn’t have any big expectations for me; they let me loose with their data so I could learn things. This internship is what introduced me to natural language processing, machine learning, neural networks, and data engineering.
I went into my junior year with this idea that data, language, and machine learning were really cool things. I joined a lab in the cognitive science department. I got started on one of the projects that a graduate student in the lab and I had talked about. He held my hand the first few months as I got to know the work that they were doing and why they were doing it. Throughout the rest of my junior year I was taking, for example, the COGS 118 series, and I ramped up my math background. I took some probability classes, and statistics classes, getting the tools to understand what I could do with machine learning and natural language processing. I continued on with the research that I was working on, which revolved around how the metaphors that we use in our daily communication affect how we think and how we behave. This was not so much machine learning and natural language processing, it was much more cognitive linguistics and psychology. But it introduced me to research methods, experiment design, the process of doing research, and how you come up with research questions.
I looked around for an internship between my junior and my senior year. I wasn’t really on the ball with it and I didn’t end up getting an internship for that summer. I think I was expecting it to be as easy as the summer before: reach out to a couple of recruiters on LinkedIn and land something. Well, it wasn’t that easy. Instead of doing an internship, I continued my research over that summer. Again, I took a job as a landscaper, and I also drove for Lyft.
Going into my senior year, I had made some good progress on my research. We were finalizing a paper for an upcoming cognitive science conference. It was really exciting to learn about the formalisms of research. I was taking mostly machine learning courses and math courses. I took Leon Bergen’s NLP course, in the linguistics department. By my senior year, I was specializing in natural language processing because that’s where my background was and where my interests were. When talking to people, I was selling myself as an NLP engineer. March of my senior year was when COVID hit. Prior to this I had started job searching for a full-time NLP engineer position after graduation. I had a couple of good conversations and a couple of interviews, none of which really went anywhere. I learned some valuable lessons about how to interview and how to search for jobs. I slowed down a little bit the last quarter of my senior year. I took two classes and also did the cognitive science honors thesis program. But I had to take a step back and figure out how I could get a full-time job. I started talking to some people that I knew, and they gave me some ideas for how to go about a job search. It was about this time that I learned the importance of networking and connecting with other people in my field. I ended up not finding a job by the time that I graduated. I heavily ramped up my networking. I started talking to data scientists, machine learning engineers, data analysts, data engineers, basically anyone that was within the general area that I was in. My goal with this was to broaden the number of people that I could reach out to should I have questions, or if I needed something professionally. But also I was just learning about careers.
I had this idea that I wanted to be an NLP engineer, but at the same time it felt almost impossible because it seemed like every NLP engineer or machine learning engineer job application was asking for 5, 10, 15 years of experience. This was really frustrating, and so I was trying to figure out what I could do either in the meantime or instead of that path. Soon though, a friend of mine from UCSD reached out. He was very involved in the Basement, and has this tech entrepreneur spirit in him. He asked if I was doing anything and if I was looking for work. He said that he was building up a team and was looking for an NLP engineer and felt like I could make a good fit. I had maybe 3-5 interviews going on at that point, and he was eager to find someone to fill the position. I figured that any experience is better than no experience, and nothing was sure with the other interviews that I was in.
I took this position with a very small startup: there were six of us, including me. My responsibilities here started out with working on their optical character recognition process. They had millions of property and real estate PDF documents. The goal was to OCR those, and extract the text from them so that we could display that text to the end user. I started out doing this OCR work, and it was my job to figure out how we could OCR all 20 or 30 million documents. This was unlike anything that I had worked on in the past. With my research position I was working with a fairly small data set. And no longer was it just a simple data format, numerical or anything like that. These were PDFs, which are notoriously difficult to work with. We were on a tight timeline for that, so my position soon shifted into more of the NLP stuff. After we got all the text drawn out from these documents, we needed to come up with some models or some heuristics to pull out the interesting information. This was my favorite part of the role. I ended up not doing a whole lot of machine learning in this section of my role. Machine learning, at least in the state that it’s at right now, especially for natural language processing, requires you to have a huge labeled training set. We didn’t have this and we certainly didn’t have the budget to create it. We had to sidestep that, put it on the back burner, and figure out what we could do that didn’t involve having a huge labeled data set. The rest of the company was building out the product and talking to potential customers. In the end, we weren’t able to sell the product, and we ran out of money and had to fold. The last few weeks were immensely frustrating for me. I had to put on my marketing and sales hats and make cold calls and reach out to people, and do things that I was not trained for and was not interested in. Although it was really frustrating, I learned a lot of valuable lessons from this.
I have a hunch that any engineer, any computer science or mathematics person, that goes to work in the corporate world should be required to take some sort of class or internship as a salesperson. The two go hand in hand. If you’re an engineer, you don’t have a job if there’s no one selling. If you’re a salesperson, you don’t have anything to sell if there’s not an engineer. I think that that relationship isn’t valued enough.
Anyways, we folded and I was reaching out to old connections of mine. One of them, Andrew, was working at a company called BetterUp. He was looking for someone with my skill set to take on fairly soon and we decided that it would be a good fit. That’s when I joined BetterUp as a research assistant. Since then I’ve done a variety of different things. I started out working on an internal search engine that brought in a lot of my DevOps and NLP engineering skills. I got to dig into information retrieval and big data sets, and this was really fun. We built out an excellent prototype that got a lot of people jazzed. I’ve also been doing various survey analyses. I have been taking part recently in a broader research goal of studying how we communicate, mechanically, in conversations. So looking at things like how long are my turns, or utterances? How long am I speaking for? How long do I wait after you talk before I talk again? And then I’ve also been helping build experiments to study how people behave in the workplace. A big aspect of what makes you happy at work is how you communicate.
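As a rough illustration of the turn-level measures described above (how long each turn lasts, and how long one speaker waits before responding to the other), here is a minimal Python sketch. The `Turn` record, the function names, and the sample timestamps are all hypothetical; this is not BetterUp’s data or method, just one way such measures might be computed from a timestamped transcript.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Turn:
    """One utterance: who spoke, and when (seconds from conversation start)."""
    speaker: str
    start: float
    end: float


def turn_durations(turns):
    """Mean speaking time per turn, keyed by speaker."""
    by_speaker = defaultdict(list)
    for turn in turns:
        by_speaker[turn.speaker].append(turn.end - turn.start)
    return {speaker: mean(d) for speaker, d in by_speaker.items()}


def response_gaps(turns):
    """Wait between one speaker finishing and a different speaker starting.

    Gaps are keyed by the speaker who responds. A negative gap means the
    responder started talking before the previous speaker had finished.
    """
    gaps = defaultdict(list)
    ordered = sorted(turns, key=lambda t: t.start)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.speaker != prev.speaker:
            gaps[cur.speaker].append(cur.start - prev.end)
    return dict(gaps)


# Made-up example conversation between speakers "A" and "B".
turns = [
    Turn("A", 0.0, 4.0),
    Turn("B", 4.5, 6.5),
    Turn("A", 7.5, 9.5),
    Turn("B", 9.0, 12.0),  # B starts before A finishes (overlap)
]

print(turn_durations(turns))
print(response_gaps(turns))
```

Keying the gaps by the responding speaker makes it possible to compare how quickly each person tends to jump in, and keeping negative values preserves interruptions rather than discarding them.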
Kyle Shannon: Great, thanks for that overview. When you speak to most data scientists, machine learning engineers, or anybody who falls under that discipline, you always hear a different story about how they fell into the job. If you talk to a traditional software engineer, it’s typically, “I went to school for a CS degree,” or did a boot camp, and then got an internship and worked up that way. But at least in my experience, data science is a hodgepodge of all different types of backgrounds and experiences that throw people into that line of work. So it’s interesting to hear.
AL: I think something that I’ve been learning recently is that there’s no such thing as a bad professional experience. My research has been incredibly useful in building my skill set. My experience at a startup was incredibly valuable. My jobs as a Lyft driver and as a landscaper have built out my problem-solving tools. I think as a data scientist there’s no “bad” path or one set path that someone should or needs to take in order to become a data scientist.
KS: A lot of people tend to get hung up on the tools, techniques, and analysis that you have to do to do data science or any sort of STEM-based project. But I always find that the most valuable skill, and the one that a lot of junior engineers and scientists have a hard time getting really good at, is communication. How do you effectively communicate, in a simple and fast way, what it is that you’re trying to do or what you need? Like you said, good salespeople are very good at communicating and very good at finding out what problems somebody has and how to match a solution to that problem, in a way that shows the customer that we really care about solving the problem for them, which we need to do, but also that we have a product they can use. They also sometimes do a bad job: they reassure customers that we can solve something for them without having had a good conversation with the engineering team. So they’ll come back and say, “What do you mean you can’t implement this feature this way? I told the customer that we could.” Learning how to communicate with your colleagues, but also with people who are not technical or who don’t necessarily know the work that you’re doing, is one of the biggest things that people going into any STEM field can get better at. It’s really important.
AL: One of the things that I’ve been realizing lately is the importance of writing and documenting and being able to communicate effectively. I recommend you write what you do and write how you feel and write down your thought process. People want to know how you got from A to B, and you need to be able to communicate that. I’ve found that a lot of engineers can come up with a solution, beautiful or not, but struggle to communicate why they did that or how they did that. As if they never really thought critically about why they did that. It’s been frustrating to see and introspect on how I do that, and then try to improve from there. I agree with you, communication is important.
The other thing that you just made me think of is how bogged down the data scientist gets on the tools and the technology. Someone that I talked to a couple months ago, asked what I was interested in as a machine learning engineer and as a data scientist. I said, Python and PyTorch and this and that, and all these tools and technologies. He came back to me basically saying, that’s great that you’re interested in tools, but you don’t seem to have a problem or an issue that you care about. I think it’s very, very easy to go through college thinking that knowing the tools and technologies is enough, when in fact it’s important to understand how to critically think. It seems like you can learn the tools from a variety of different ways, like taking the data science courses and the CogSci courses at UCSD is one way to learn it. But as I think about the classes that I took, a lot of them didn’t really teach me how to think. They taught me the tools and the math and the statistics behind everything, but on a day to day basis, that’s only half of the problem, and it’s hard to get past that desire to just want to know the technology and not the thought processes.
KS: Yeah, I think that’s why companies will often ask for people who have master’s degrees or PhDs, because at that level you’ve transitioned from being asked to do problem sets to setting your own barometer of success with your own parameters. You have your own constraints, and it’s up to you to fulfill them, and that’s a different way of approaching problems. As an undergrad, it always seems like there’s this epic struggle between learning the right tools and techniques that are used in industry for a job, and understanding in a holistic sense how to learn new tools and, from a theoretical base, the process you would take to do something. At that point the tools are almost irrelevant in some way. And with how quickly technology moves anyway, the tools are going to change. In front-end engineering, for example, it feels like a new framework comes out every two years. These things are constantly changing and evolving, so there’s a big balance between learning just enough tooling to get something done and understanding how to approach problems and how to ask interesting questions. That’s something you typically learn on the job or in further educational training, so you have to short-circuit that a little bit through internships or by working on really interesting problems in your own projects, which you can then talk about in your interviews.
AL: I couldn’t agree with you more. I think that it’s really useful to have the technical skill set. Every day I go back and forth between R and Python. It’s incredibly useful to be fluent in both of those so that I can utilize them when they should be utilized. I think data scientists, for example, get hung up on this R versus Python debate. But really you should know both because they both have their strengths and their weaknesses. You shouldn’t let a tool define the work that you do, or the questions that you ask. Instead, you should be coming up with the questions and apply the tool when necessary and where useful and appropriate.
KS: Do you recommend going to grad school for machine learning?
AL: I don’t know, I haven’t done it. I have been debating a PhD program for myself because I do really like research. Something that I’m realizing is that, at least in the PhD, it’s important to know what you want to work on and who you want to work with, the latter being especially important. You’ll be spending at least four years working with the same group and the same advisor. It’s important that you enjoy the environment that you’re working in. The problems can always change. Say you pick a group and you don’t like the advisor and there’s no room to move around. You’ll drop out because you won’t enjoy it. If you choose an advisor or school who is flexible, then I think you’re setting yourself up for success. You’re the only person that’s going to know whether you want to go to grad school for machine learning. There are plenty of people who get machine learning engineer roles without going to grad school.
KS: That’s a good piece of advice for grad school. I think there are typically three reasons to go to grad school. One is because you want to go into academic research. You’re really interested in one specific area, and for that you have to go beyond just master’s study because you need to start doing research. If you are curious about one particular part of machine learning or something, then a PhD can really help you get there. But maybe you have a STEM background and you want to do machine learning engineering; then a master’s degree could be useful to fill that gap in knowledge. Or maybe it’s for upgrading your career. So it’s kind of like a strategic option. You have grad school for the pure beauty of research, or as a strategic option, or maybe another reason. But have a good reason to do it; don’t just do it because you have nothing else to do.
AL: One more note about grad school: I’m extremely happy that I didn’t go to grad school immediately out of undergrad. I think that the undergraduate who goes to graduate school immediately and is truly happy with what they do, is few and far between. Getting a variety of experiences post-undergraduate can never be a bad idea. It will help you hone in on exactly what you care about and what interests you. Having industrial, corporate experience can also help you figure out how those tools can be applied in the real world.
KS: From my experiences in grad school, it wasn’t even so much material that I learned. Certainly that’s a big part of it, but the biggest part was the people that you meet and the connections that you make. These are people who are, for the most part, interested in working in the area that you’re interested in. You can rely on them when you’re looking for new opportunities. It’s kind of the social experience you have, and the more time you spend in industry beforehand, the easier it is to make those long lasting friendships because you bring something to the table in terms of your experience.
How would you recommend getting into data science positions or internships?
AL: Network. I have only ever gotten one interview through applying to online applications. Yet, I’ve gotten three jobs and a research position through networking. And I’ve made a number of really awesome connections through networking.
Networking is something that I heard of as an undergrad; people were always saying, “Oh you should network,” and you’d hear this at career events and fairs, but no one ever really told me how or what that exactly entailed. It wasn’t until my senior year that I read a blog post that had this email template that said, among other things, how to explain your interest in meeting with that person. It was then that it clicked with me that networking is probably the best way to land a position, and if you don’t land the position, there are plenty of other benefits to networking.
All I really need to tell you is to go on LinkedIn and search up machine learning engineer at wherever. Or data scientist. Some of them have an email in their bios, some of them you can Google and find their home page. The goal is to find an email address and email them: “Hey, I found your profile. I really like the work that you’re doing at X company. It looks like you’re working on Y technologies. This is something that really fascinates me. I’m an undergraduate at UC San Diego and I am looking to grow my network. Do you have 15 minutes next week for an informational interview?” The key there is informational interview. The goal of an informational interview is an introduction. You two get to know each other a little bit and that’s about it. You try to understand what they do and you tell them about what you’re interested in, at least at this stage in your career. Don’t jump out of the gates asking for a job at their company. In some cases, you might not even ask for anything. You just are there to hear about their story. People love to talk about themselves. When it comes time to ask for something, the more specific you can be, the better. Keep a tab on their work history, on the jobs page of their company, and see what positions are maybe opening up on their team or with people that they work with. If that’s something that interests you, reach out to them and say, “We spoke a couple months ago, I see that there’s a new data science position open. Do you happen to know the hiring manager?” I think that this will give you much more success in your job search.
However, it does take a little bit more planning and preparation and time. With an online job application, you tailor your resume and a cover letter and send it off and you’re done. It’s frankly fairly easy, but at the same time there is very little feedback and you have no idea when you’re going to hear back from them. In contrast, if you’re talking to a real person, and particularly someone that you know, then I think you’re much more likely to have a direct introduction to someone that can get you hired, or at least some sort of immediate feedback. As a new college graduate, people don’t really expect you to have a whole lot on your resume. So trying to tailor that for every single job application will be far less effective than networking.
KS: One person asked: what were the most useful classes you took at UCSD for going into data science?
AL: Before I answer that, I’ll say that one of my regrets in college was not taking more classes outside of data science, machine learning, and cognitive science. College is a time to explore and to learn new things, and when you branch out of your particular domain you learn new ways of thinking and solving problems. That was something that I didn’t realize when I was signing up for classes, and so I ignored beautiful classes in anthropology and history and computer science, since I wasn’t a computer science major. If you’re a computer science major, then maybe it would be cognitive science or something. I wouldn’t get too hung up on taking specific technical classes in college; there are always going to be more technical skills that you can learn on the job or through other experiences. That being said, some of the ones that I really enjoyed were the COGS 118 series. I really enjoyed Professor De Sa; she was a great teacher. I also really enjoyed, as I mentioned earlier, Leon Bergen’s natural language processing class in linguistics. He did a really excellent job of explaining the mathematics behind LSTMs and recurrent neural networks. At the very end of the quarter we even got into transformers, which were relatively new at that time. It was exciting to be learning about something extremely cutting edge right there in class, and in particular the mathematics behind it. I took a critical gender studies class on race, gender, and AI, which was a really transformational class for me. It pushed me to question a lot of the things that we learn as engineers, like asking why we’re implementing something and what the consequences are if we implement it, which I think aren’t questions that engineers ask often enough. So I wouldn’t shy away at all from the humanities, particularly if a class has more of a technical bent to it.
I think the sociology department probably has a couple of classes on society and technology. That would push you to think about technology in new and different ways. I still talk about that critical gender studies class with interviewers. People like seeing when you branch out and think about your domain in new and different ways. Math 181, I think, was the mathematical statistics series, and that was a super useful set of classes to take if you’re interested in understanding how neural networks work. It covers mostly regressions and things like that, but neural networks are fancy regressions, and having a good foundation in linear regression will be another tool to add to your toolbox, so that when it comes time to apply some of these skills, you’ll know when it’s perfectly okay to use a logistic regression versus a 10-layer neural network. Your boss will appreciate it if you can choose a solution that requires less money and less time and is more interpretable. So, COGS 118 and Math 181. If you find a grad course that you’re interested in, sign up for it, even if you get a B or maybe even a C. It will put you in a new environment that will completely change the flow that you’re used to as an undergraduate. I took a couple of cognitive science graduate courses, and they’re very small, 10-15 people. Most of the work was not necessarily application; it revolved more around research and reading papers, which I found very useful. If you get a chance, you can definitely sign up for some of the CogSci graduate classes as an undergraduate, though you do need approval. I’m not sure if you can take any of the math graduate courses or the CS graduate courses.
I think you can take the CS graduate courses, but both of those would be good areas to look into if you’re looking for an extra challenge in a smaller group setting. In the end, though, I think the specific classes really don’t matter that much.
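To make the remark that “neural networks are fancy regressions” a little more concrete, here is a minimal Python sketch, under the assumption that a logistic regression can be viewed as a network with a single sigmoid unit trained by gradient descent on log loss. The function names, the learning rate, and the toy “hours studied” data are all invented for illustration; none of this comes from the courses mentioned.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """A single 'neuron': weighted sum of the inputs, then a sigmoid.

    This is exactly the prediction rule of logistic regression.
    """
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, lr=0.3, epochs=5000):
    """Fit the weights with plain batch gradient descent on log loss."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y  # gradient of log loss w.r.t. the weighted sum
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy data (made up): hours studied -> passed the exam (1) or not (0).
samples = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train(samples, labels)
print("weight:", weights[0], "bias:", bias)
print("P(pass | 2.0 hours):", predict(weights, bias, [2.0]))
```

A deeper network stacks many such units with more parameters to fit and explain, which is one reason the simpler model can be the cheaper and more interpretable choice when it performs well enough.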
KS: Yeah, it’s always good to get out of your comfort zone with different classes and to learn to overcome that inner adversity to doing something new. Think about the first day of a new job: it’s going to be a new environment that you’re not used to, and there will be a lot of adversity to overcome within yourself to branch out, talk to people, meet new people, try to make new friends, and learn all these new procedures and processes, all while feeling like you don’t know anything. So the more time you can spend putting yourself in those adverse environments as an undergrad, when the stakes are not the same as they might be at your first job, the better. It’s fine, and expected, to fail as an undergrad, and through that process of failing you learn more and more, so that you fail less hard and less often, right?
AL: Yeah, real quick, I’ll just say that I think failure is great, and I totally overestimated how much things mattered in undergrad. I probably should have put myself in riskier positions, taking classes that might not have contributed to my technical knowledge right then, but over the long term would have let me fail, fail fast, and learn something new in that moment.
KS: You said you were interested in finance and investing. Did you ever consider a career in quantitative finance, or applying data science at all in the financial world?
AL: I did briefly consider it, especially during my senior year when I was reaching out to people. I spoke to a couple of quants just to hear about their day to day, the technologies that they were working with, and the problems that they were solving. Econ and investing in particular, not so much finance or accounting, still interest me, but I know now that it’s not a career I would enjoy. Working with people, or even just something organic, is really important to me, and it’s something that I really enjoy working on. A quantitative finance role, I think, would be too rigid for me personally, and I wouldn’t be working with the types of data and the types of problems that truly fascinate me. So yes, if that’s a career you’re considering, I encourage you to find people in the roles that you’re interested in, talk to them, and see what they do on a day-to-day basis and the kinds of problems that they’re asking and answering.
KS: One person asked: when you get your first job, what advice do you have for that transitionary period, to be successful in your new job or new career?
AL: I think that’s a really good question, and it’s something that I’m still figuring out. I don’t think that there’s any one thing that you can do. However, here is some more general advice that applies to more situations than just the first job: ask a lot of questions and reach out to a lot of people. As a new person, you have a sort of grace period where you can ask any stupid question that you want and other people will just brush it off: “Oh, they’re new to the role, it’s okay that they don’t understand this.” These could be technical questions, or questions about the culture, or questions about your own role. Getting all the questions you can think of out of the way as soon as possible not only builds your own knowledge of where you’re working and the work that you’re doing, but also sets the tone for later in your career with this team or company: you’re an inquisitive person with a lot of questions, which is never a bad thing. I think you should always be asking questions, but setting that tone early on, that this is who I am, a person who asks a lot of questions, is a decent personality characteristic to show. Then, depending on the size of your company, reach out to people. Right now I’m at a company that’s almost 300 people; prior to this, I was at a company that was six people. At the company that I’m with now, I’ve been reaching out almost weekly to new people in various areas of the company and having a 30-minute coffee chat with them. What do you do? How did you get to where you’re at? Are you reading any interesting books? These sorts of conversations not only get your name around, which is important if you’re trying to sell your work internally, maybe for a promotion, but also give you an opportunity to meet new people.
Uh One more thing I would recommend would be document everything that you do, especially early on in your career, um writing writing down, I would keep a journal of some sort with you and every day write down what you’re doing at the end of the day, maybe what you accomplished or something like that. Um You don’t have to get too detailed or anything, but in six months when you’re uh talking with your boss about a promotion or you’re doing new interviews or something like that, inevitably they will ask you, you know, what have you accomplished or what have you, what have you been up to or doing? Um And it will be very helpful to have notes from six months ago about what you were doing, either on a day to day basis or um just large scale projects that you’re working on, um document the questions that you’re asking in your thought processes as well. People care a lot about how you think and how you solve problems, um And so, being able to articulate how you do that in an interview um is invaluable.
KS: Another question, how much does GPA matter?
AL: GPA doesn’t matter.
KS: I mean, past a threshold. You want to be above a 3.0 as much as possible, typically. But it also depends on the school. I think a lot of people will understand that certain schools do grade harder. So if you’re applying for grad school, students do get looked at differently depending on the school. If you take a few years and get some industry experience before going to grad school, the impact that GPA has drops dramatically, at least from what I understand. Nobody has ever actually asked for my GPA from undergrad or grad school, except when I applied to grad school. They wanted to know, but I don’t think anyone really looked at it. Yes, GPA doesn’t matter for job applications, but only to a certain point. If you put down a 1.9, that raises questions, but if it’s anywhere above a 3.0, people generally don’t care. If it’s a 2.8 or something, but you’ve got a lot of job experience, like Alex was saying, people often don’t even look at it. And those forms you fill in for online job applications just go to a database, which makes up a profile for you. It doesn’t mean that everyone is scouring the data in your profile; they use it just for completeness. But I wouldn’t put your GPA on your resume; there’s no reason to. I had a funny conversation with a friend of mine who had a 4.0 in grad school. I was like, you know, that’s actually pretty bad, because now people are going to expect you to perform at a 4.0 in all your interviews. So yeah, GPA is one of those things that people think matters a lot, but often it doesn’t, and it’s often a poor indicator of on-the-job performance as well.
KS: Does computer vision have a lot to do with data science?
AL: Yes, definitely. Data science is a terrible term because everyone has a different definition of it. I’m sure that Kyle and I have very different definitions of “data science.” But yes, computer vision is definitely an aspect of data science. I think most people would say that computer vision is a subcategory of machine learning; whether or not you call machine learning a subcategory of data science is maybe just a personal preference.
KS: Yeah, the best definition of data science is the one that gets the customer to pay the most amount of money.
AL: Speaking about getting a definition that the customer buys into, I think it’s important as a data scientist to separate the hype from reality. It’s very easy to get sucked into believing that this stuff is magic and that it can sell for an infinite amount of money. But I think in your undergraduate career you’ll realize that it’s not really that magical. Sure there are things that are unexplained, but at the end of the day, like I said, it’s just regressions, and understanding that will humble you.
KS: Yeah, it’s very hard to get customers to understand that, especially since working in a data science consulting role is not necessarily a spec-based process, the way engineering can be, where you can spec out software requirements. For data science, you often don’t know if you have the right data to do the thing you want to do. That’s why communication is also so important: you have to really talk to the stakeholders and the people who want something, to understand better what they need and what they’re offering.
One last question: you mentioned transitioning between several labs. How do you know if the lab isn’t right, or isn’t right for you? And how do you leave a lab you don’t like without burning bridges?
AL: I think your undergraduate career boils down to the people that you’re working with. I didn’t feel welcomed in the first lab I was with, and I got the feeling that they thought I didn’t know enough to be a useful asset in the lab. That’s probably why I didn’t stick around with them. If you’re looking at research labs, talk to grad students over a cup of coffee. That’ll give you a very good indication of whether or not you want to work with them. They will be teaching you a lot, so if you’re asking them questions during your first meeting and they’re just not that good at explaining what they do or how they do it, that’s an indication that they might not be the best teacher for you.
If you’re looking for a research position, I would skip talking to the professor at all, and go straight to the grad students. They’re the ones doing most of the nitty-gritty work. They’ve got their own projects and are the ones looking for undergraduate research assistants. Ask if they’ve got any projects that they might be looking for help on. If in a quarter or two you just don’t see it going anywhere, or you decide that your interests have shifted, it’s difficult to say, but just leave. You might burn a bridge, particularly with that grad student, but if your interests have shifted then it might not be that important of a bridge in the grand scheme of things. Your excitement about a topic and excitement for working with certain people could be more important. You might just let them know that your interests have changed and you’ll be looking for work elsewhere.
KS: Great. Well, thanks so much for joining us for class, Alex, and for taking the time to tell us your story, answer some questions, and give some insightful feedback.