Real-world problems are not only about modeling; they involve brainstorming on a lot of different situations. – Abhishek Mamidi

Interview with Abhishek Mamidi

Abhishek Mamidi

Data Scientist, YouTuber, Tech Blogger
Bengaluru, Karnataka, India

Q1. Please share your educational and professional journey.

  • I started learning Data Science in the second year of my BTech. I began with Andrew Ng’s Machine Learning course, one of the best courses on ML. Then, as part of the curriculum, I took Deep Learning, Machine Learning, NLP and Computer Vision courses. I worked on Data Science projects in each of these academic courses.
  • After my third year of BTech, I worked as a Data Science Intern at Monsanto, where I worked on a real-world data science project. I learnt a lot during my internship, and it gave me an opportunity to think differently.
  • In my fourth year, I participated in a lot of hackathons, which gave me a lot of confidence in solving different data science problems.
  • After completing my BTech, I joined ZS as a Data Science Associate. I worked on product development projects and client projects in the pharma industry. I am very happy to use my data science skills to solve real-world problems.
  • Please check out my complete Data Science journey video here: https://www.youtube.com/watch?v=JaXTKz2kezQ
  • Blog: https://www.abhishekmamidi.com/2019/08/my-data-science-journey-and-suggestions-part-1.html

Q2. What is the importance of data in any Data Science project? Does clean data help solve problems faster and save time?

  • Without data, there is no data science. We can extract insights only if we have quality data related to our problem.
  • When we work on a problem, we get data in different formats. The data can be structured or unstructured. We cannot ignore unstructured data because it might contain hidden information.
  • That is why it is recommended to clean the data by removing unnecessary information. This helps data scientists focus on quality data and extract useful insights in less time.

Q3. Which is the hardest part of any Data Science project?

  • In my view, creating features from raw data is one of the hardest parts of a Data Science project. It requires domain knowledge, and we should know the ins and outs of the problem we are solving. Without good features, we cannot build a good model.
  • The next hardest part is approaching and solving the problem itself. Real-world problems are not only about modeling; they involve brainstorming on a lot of different situations. For example, if we are building a recommendation engine, we should think through all possible cases. The final algorithm should be able to recommend appropriate items to a new user, a new item to an existing user, or a new item to a new user, all with limited information.
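
As a rough illustration of the recommendation cases mentioned above, here is a minimal Python sketch. It is not the method used in any project described in this interview. The function `recommend`, the `interactions` and `item_tags` structures and the scoring rule are all hypothetical: a new user falls back to popular items, while an existing user gets items ranked by tag overlap, which also works for a new item that nobody has interacted with yet. The hardest case, a new item for a new user, is not handled here and would need extra information such as user profile data.

```python
from collections import Counter

def recommend(user_id, interactions, item_tags, k=3):
    """Pick up to k items for a user, with simple cold-start fallbacks.

    interactions: dict user_id -> list of item ids the user engaged with
    item_tags:    dict item_id -> set of descriptive tags (hypothetical metadata)
    """
    seen = set(interactions.get(user_id, []))
    # Popularity counts over all users; only items with history appear here.
    popularity = Counter(i for items in interactions.values() for i in items)

    if not seen:
        # New user: no history, so fall back to the most popular items.
        return [item for item, _ in popularity.most_common(k)]

    # Existing user: build a tag profile from the items they already know.
    profile = Counter(t for i in seen for t in item_tags.get(i, ()))

    def score(item):
        # Tag overlap works even for brand-new items with no interactions;
        # popularity only breaks ties.
        overlap = sum(profile[t] for t in item_tags[item])
        return (overlap, popularity[item])

    candidates = [i for i in item_tags if i not in seen]
    return sorted(candidates, key=score, reverse=True)[:k]
```

The point of the sketch is that each case needs its own fallback, which is why the brainstorming happens before any modeling.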

Q4. Which type of Machine Learning is most widely used to solve real-world problems?

  • Most Machine Learning problems are supervised problems. Labeled data is always better than unlabeled data because it gives us prior information about the data. We always try to convert a problem into a supervised problem.
  • If we go deeper, most of the problems are classification problems. Some real-world use cases are image classification, sentiment analysis and spam/ham email classification.

Q5. Many online courses on Data Science are available on the web. Which are the best online courses for Data Scientists?

  • Yes. There are many online courses on Data Science and it’s very difficult for a newbie to decide which course to take.
  • In my view, the two courses below are good to start with.
  • If you are stuck on a concept or algorithm, you can always get help from YouTube videos, Medium blogs or Stack Overflow.

Q6. Why is Kaggle so important for any Data Science aspirant? Does a Kaggle rank really help in getting a job?

  • For students or data science enthusiasts, it’s very difficult to find projects or real-world problems to work on.
  • Kaggle is a place for data scientists to work on different problems and learn from kernels, grandmasters and peers for free. It provides the dataset and the problem, so you can focus on problem solving and modeling. So, if you are getting started in Data Science, Kaggle is the best place to work on problems and learn quickly.
  • When you participate in competitions on Kaggle or create kernels, your main goal shouldn’t be a high rank or many upvotes. You should mainly focus on building your skills. The journey on the Kaggle platform will surely help you in getting a job.

Q7. You run a Data Science blog and a YouTube channel. Please share something about both. What kind of content do you provide on them?

  • The main aim of my Data Science blog and YouTube channel is to share my experience and data science journey, and to make technical concepts easier. There are a lot of concepts and algorithms that are used in industry but are not taught outside it. My focus is to share the concepts that are not easily available elsewhere.
  • My blog and YouTube channel are for both Data Science enthusiasts and professionals. Anyone can learn from my articles/videos.
  • Data Science blog: https://www.abhishekmamidi.com/
  • YouTube channel: https://www.youtube.com/channel/UCr2uD7VzAGWjWG3BK8w_6jA/

Q8. What kind of problems have you faced in the Data Science domain, and how did you resolve them?

  • I would like to give some tips to the readers of this article:
    • In Data Science projects, you experiment a lot. You try different combinations of features, modeling techniques and preprocessing steps. After a certain period, it becomes difficult to track all the experiments. In these cases, MLflow really helps to track and manage experiments.
    • Setting a fixed random state (seed) in all your experiments is very important. It helps you reproduce the same results and avoids confusion.
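
The two tips above can be sketched together in a few lines of Python. This is only a minimal illustration using the standard library: `run_experiment` is a hypothetical stand-in for a real training run, and the plain list `experiments` plays the role that a tracking tool such as MLflow would play in practice. The parameter name `n_features` and the scoring formula are made up for the example.

```python
import random

def run_experiment(params, seed):
    """Hypothetical stand-in for a training run; returns a noisy 'score'."""
    rng = random.Random(seed)          # dedicated generator with a fixed seed
    noise = rng.uniform(-0.05, 0.05)   # simulates run-to-run randomness
    return round(0.8 + 0.01 * params["n_features"] + noise, 4)

def track(log, params, seed):
    """Run one experiment and append its settings and result to the log."""
    score = run_experiment(params, seed)
    log.append({"params": dict(params), "seed": seed, "score": score})
    return score

experiments = []
first = track(experiments, {"n_features": 5}, seed=42)
again = track(experiments, {"n_features": 5}, seed=42)
assert first == again  # same seed and settings reproduce the same result
```

Recording the seed next to the parameters means any logged result can be rerun later and checked.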