The Data Science Interview

Posted by reddysumanth

Updated: Sep 23, 2015


As my peers and I drew closer to the conclusion of our data science bootcamp, we started to turn our attention to the job market. A number of people expressed anxiety over the tasks ahead. Most of them had not even tested the data science job market, and were worried about many parts of the process. In a sense, I knew exactly how they felt since I had been testing the market for almost a year prior to my enrollment in the bootcamp.

That anxiety stems from the uniqueness of each interview: what questions are going to be asked, and what do I need to say to impress the person in front of me? This is a lot of pressure when you are making a career change. That said, after completing this bootcamp, I no longer have this anxiety. Prior to the bootcamp, I was often asked questions that used terminology I was not even familiar with. It was intimidating, but remembering the questions from back then and comparing them to the questions I'm being asked now has made me realize that every company hiring in data science is following a template. Once you understand what that template is, if you are competent in statistics and programming, these interviews should be a breeze.

The Job Description:

Ignore the required experience. Don't even look at it. Go straight to the job responsibilities and see if you meet their programming needs. That is the only 'requirement' for any job in data science. I used to count myself out of jobs for which I didn't have the required experience, and some of my bootcamp peers expressed this same feeling. Despite this, recruiters were reaching out to me through my LinkedIn or Indeed profiles about jobs where I did not meet the required experience!

It's just a smokescreen to filter out people who are truly unqualified for these kinds of jobs. It's easy to lie on an online profile, and as a result, recruiters have a tough job assessing who actually has the skills to come and contribute to their company. So the only thing you need to worry about is your programming skills. If you have what they are asking for, you are qualified to compete for that job.

The Application:

Most of this should be straightforward, but be sure to organize your resume well. Your skills and most recent, relevant experiences should be up near the top. Recruiters aren't going to spend much more than 10-15 seconds glancing over your profile before they decide whether to continue researching you or move on to the next person's resume. You're competing for their attention in those few seconds, so make sure you have the best information about yourself near the top, along with your contact information.

I'm on the fence about writing a unique cover letter for each job. Of the last two companies I wrote company-specific cover letters for, one is interested and the other hasn't even reached out to me. I've also received interest from companies where I sent a generic cover letter. It's up to you how much effort you put into this, but if you are really interested in the job, I would suggest reflecting that interest in the cover letter. It's the only way to get the recruiter to see your personality and what kind of commitment you will bring.

The Recruiter:

In my opinion, this is the easiest part of the process. It's just a 10-15 minute conversation with someone who wants to hear about your history and who you are. They will not know any technical details, and you don't need to worry about impressing them in any way. Just let them know what you've been doing in the past, and mention some projects you may or may not have worked on. It really doesn't matter. As long as everything sounds reasonable and you don't come across as a crazy person, you should move on to the next stage every time.

The Technical Interview:

This can vary slightly between companies. Some will immediately test your coding skills by asking you to white-board on a screen-share meeting. If they do, you need to practice your coding skills. I recommend HackerRank and LeetCode as websites with problems similar to the ones you will encounter. Make sure you can complete the basic algorithms in the necessary language.

While a coding test is always part of the process, I find that most companies prefer a phone conversation first. This stage is where the uncertainty of the conversation truly comes into play, and many people can freeze up. There is only one way around this: you have to feel prepared. The recruiter will tell you the name of your interviewer if you ask, and you should immediately look up their profile on LinkedIn. Specifically look up their work experience and skills so you know what their areas of expertise are. You can gauge the difficulty and type of questions that are coming just by gathering this information.

Once you know a little about your interviewer, there are specific types of questions that you have to be ready for. Be ready to talk in detail about your most data science-relevant project work. This is extremely important; you must be able to speak confidently about the work you have done. Think about each project and have an answer ready for these three questions: What was the problem? What was your solution? Were there any obstacles you had to overcome along the way? This is your opportunity to impress the interviewer with some work that they may not have any knowledge of while also demonstrating your capabilities.

You will always be given one or more mini test cases, where you are asked to think about a problem and give your ideas on how to proceed with it. There may or may not be one correct answer they are looking for, as this is usually about them trying to see how you process a problem. "A case study records the transactions of 100 supermarket customers. 70 were men, 30 were women, do you see any issues that might arise from analyzing this data?" Don't be afraid to ask questions; you are a data scientist, and you should always welcome more information. Make sure you fully understand the question before you try to answer it. Your interviewer will be happy to try and explain it as clearly as possible.
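One possible line of reasoning for the supermarket question, offered as a rough sketch and not as the answer an interviewer expects: with only 30 women in the sample, any rate estimated for that group is noisier than the same rate estimated for the 70 men. The 50% rate below is a made-up assumption, and the function name is hypothetical; only the group sizes come from the question.

```python
import math


def proportion_margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p observed in n samples.

    Uses the normal approximation, which is itself an assumption.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Group sizes come from the example question; the 50% rate is an assumption.
group_sizes = {"men": 70, "women": 30}
assumed_rate = 0.5

for group, n in group_sizes.items():
    margin = proportion_margin_of_error(assumed_rate, n)
    print(f"{group}: n={n}, margin of error = +/-{margin:.3f}")
```

The smaller group gets the wider margin, which is one concrete issue you could raise. Whether a 70/30 split reflects the store's real customer base is another question worth asking the interviewer.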

That is most of the heavy lifting, but don't be surprised if your interviewer sneaks in a simple statistics question somewhere. What is a confidence interval? How is standard deviation calculated? These will not be challenging if you have even a basic knowledge of statistics.
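As a quick refresher on those two questions, here is a minimal sketch using only the Python standard library. The data is made up, the function names are hypothetical, and the interval uses a normal approximation (z = 1.96); for a sample this small a t-distribution would usually be the better choice.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation: sqrt(sum of squared deviations / (n - 1))."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


def mean_confidence_interval(values, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    n = len(values)
    mean = sum(values) / n
    margin = z * sample_std(values) / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical data, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_std(scores))                # hand-rolled version
print(statistics.stdev(scores))          # standard library, same definition
print(mean_confidence_interval(scores))  # (lower bound, upper bound)
```

In words: the standard deviation measures how far values typically sit from their mean, and the confidence interval is a range around the sample mean that would contain the true mean in roughly 95% of repeated samples.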

The Coding Quiz:

Some companies might prefer this over a technical white-board, some might do both. Some might even do neither and proceed straight to the on-site white-board. Also, the order of these can vary.

For those that do give a quiz, it will usually be a test case where you are given data and asked to perform computations on it and/or make predictions. You may need to deal with missing data. This might be a challenging exercise, but if it is, you will be given ample time to complete it on your own.

A few examples:

One company gave me an Excel sheet of very small data tables and asked me to compare categorical groups. This required joining the data tables and aggregating data. They gave me a flexible 2-3 day window in which to submit the exercise, which was far more than was needed. This exercise was simplest in SQL, but considering the ample time, I used Python in order to produce some graphs.
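To give a feel for that kind of join-and-aggregate exercise, here is a small sketch that runs SQL from Python through the standard library's `sqlite3` module. The tables, columns, and values are all made up for illustration; they are not the data from the actual quiz.

```python
import sqlite3

# In-memory database with two tiny hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, segment TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'retail'), (2, 'retail'), (3, 'wholesale');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 40.0), (3, 2, 30.0), (4, 3, 200.0);
""")

# Join orders to customers, then compare the categorical groups.
query = """
    SELECT c.segment, COUNT(*) AS n_orders, AVG(o.amount) AS avg_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.segment
    ORDER BY c.segment
"""
rows = conn.execute(query).fetchall()
for segment, n_orders, avg_amount in rows:
    print(segment, n_orders, avg_amount)

conn.close()
```

Keeping the query in SQL while running it from Python leaves the results in Python afterwards, which is convenient if you also want to produce graphs.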

Another company gave me two test cases: one was a dataset of 5000 observations with 254 predictors (numerical and categorical) and one target variable. I was asked to build a model and submit my predictions for a test set. For the second test case I was asked to perform queries on a set of text files that first needed to be aggregated. This coding test was far more challenging than the first example I gave, yet they asked me to submit my solutions within 24 hours of opening the test file. This just goes to show that companies are going to vary in how hard they test you during the interview process, and will vary further in how competitive this process is.

The On-Site Whiteboard:

For some companies, if you have already done some white-boarding or coding quizzes, this might be a final interview with the management. That said, you should expect the final stage of your interview to be an on-site pseudocode exercise. It seems pretty unfair to subject data science candidates to this pressure since it doesn't really apply to any part of the job, but it is what it is. I'm guessing that in addition to helping them eliminate poor performers, it gives them insight into what the work environment around you might be like.

If you haven't been doing it already, practice on HackerRank and LeetCode. Pick only the one language that you feel strongest with. Make sure you look over the discussion forums, where people will share their solutions. The most efficient solutions will not be shared, but you should always be able to see the framework of a correct solution to any problem on these sites. If you are new to this, I suggest starting with HackerRank. You can see more details about your results than on LeetCode, which will reject your code without telling you what went wrong. However, if you want to compare the efficiency of your code to that of other programmers, LeetCode is nice as it will show you where your efficiency stands.

If it's just a bunch of interviews with management, then be prepared for all the same things that you were in the technical interview. Be able to talk about your projects and what kind of skills you can bring to the table. Remember, they want to know how you can make them money, so be sure to sell yourself, without coming across as arrogant.

But most important of all... RELAX. If you made it this far, then the company clearly thinks you have what it takes to come in and get the job done. They just want to make a few final checks, and as long as you haven't been depending on smoke and mirrors, you should be good to go. Feeling stressed out and anxious is not going to help anything. Even if you nail everything perfectly, there might be another candidate who just happened to blow them away. It's just one job, and even if you don't get it, you can get the next one. Get up to that white-board and do the best you can with a clear mind.

If all of the above sounds reasonable to you, then getting a job in data science should not be a problem for you. The market is ripe, and you should keep firing applications out there.


One final note - If you are very well qualified but not getting the job:

I have met a couple of well-qualified people who have expressed difficulty in landing a job. Specifically, they had passed all coding challenges and gotten to the final interview, but did not receive an offer. It is difficult to say with any certainty what the issues are in these cases, but it can often just be bad luck. There may have been better candidates, or the manager with the most sway just didn't like you as much as someone else. A lot of these companies are creating a data science team for the very first time, and aren't even sure what the best candidates might look like.

Unfortunately it may also come down to communication skills. Pay attention to the facial reactions to your statements. Try and remember if you saw a reaction that was less positive than the rest, because that may have been a tipping point. This is very much a “feel it out” part of the process.
