How to Become a Data Scientist
Since I’m fortunate to work with several Data Scientists, all with varying backgrounds and work histories, I’m often asked for career path advice by aspiring Data Scientists. What better way to advise than to share the experiences of the Data Scientists I work with most closely? I sat down recently with Oshry Ben-Harush and here is what he shared:
FC: How did you become a Data Scientist?
OBH: After spending time in the military, I decided to obtain a degree in Computer Science, specifically electrical/computer engineering. Following my BS, I worked in Engineering at Intel. From there I decided to get my Master’s in speech processing/time series signal processing and worked as a teaching assistant. I then started my PhD, but after a year of research I decided to take a role at CheckPoint (a software security company), creating text monitoring software and developing algorithms aimed at preventing sensitive information from leaking.
In May 2012 I was recruited to EMC to build a Data Science team. I was curious as I’d never heard of Data Science. We were using machine learning but I hadn’t heard this term before. Building this capability using business data and adding value was very intriguing to me.
FC: What is your education background?
FC: What skills do you think are essential for a successful Data Scientist?
OBH: The following skills are required to be a Data Scientist:
- Integration – You need to know more than the science: you must understand the business and its problems. You need to talk with data experts and understand how to manage a project. You must have a broad view of the business.
- Rapidly learn and adapt – Projects sometimes have little to no similarity in content and domain knowledge. The technology and algorithms are your toolbox, and these evolve with you, but you have to learn the domain rapidly in order to map the business problem to a form this set of algorithms can be applied to.
- Simplify your findings – Present your findings simply but be informative. Being good at this requires devotion. Having excellent technical skills isn’t enough because you have to be able to explain it well or it will fall down.
FC: What are the common tools you use in your everyday work?
OBH: I use the following on a regular basis:
Text Editors: Vim
FC: What are the most important characteristics needed to be a Data Scientist?
OBH: Curiosity, persistence, and a “data-loving” nature.
FC: What is the coolest or most impactful project you have worked on?
OBH: Two projects come to mind. When I first started out, we were assigned to predict when servers would crash. We were trying to describe the behavior of the server to see when it was misbehaving and then predict when it was going to happen. This was a very successful project and was a lot of fun.
I’m working on the second project right now. It’s so exciting because we are exposing so much potential that the sky’s the limit. I can’t share the details since it’s not complete, but stay tuned.
FC: How do you keep current with industry trends?
OBH: Personal development is a requirement and we ensure the team continues its education via formal training and industry conferences. Our Data Science team has the benefit of having a tight relationship with a local university. One team member maintains this relationship. We frequently attend training and run our own weekly seminars taught by my internal team.
FC: What guidance would you give aspiring Data Scientists?
OBH: Focus your attention on these three areas:
1) Get Educated – If your background is math-based, start with Statistics, go to machine learning algorithms and data analysis, then big data platforms. While taking these courses you can combine learning R and Python. You can find decent courses on Coursera and take them for free.
2) Get Real Examples – Look around your business unit (BU), because you already understand its challenges. Identify those challenges, try to apply machine learning algorithms to them, and show the results to your management chain. It won’t be easy, but it could be convenient because you know this domain. Many of the challenges that take time involve domain knowledge. Show two examples of how you solved a problem using Data Science. If you have a group of Data Scientists in your organization, offer up some of your time for free. With your manager’s approval, of course.
3) Get Involved in the Data Science Community – Participate in Meetup groups to discuss data science and data science projects.
A special “Thank You” to Oshry for sharing his experiences and suggestions! Hopefully this will help aspiring Data Scientists get started in a career in Advanced Analytics/Data Science. In future posts I will interview other Data Scientists to demonstrate their varying backgrounds and highlight their similarities and differences.