Interview with Dhaval Patel
Data Engineer, Software Architect, YouTuber
New Jersey, United States
Q1. Please share your inspirational journey with us.
I was born and brought up in a small village in India, but from childhood I had a fascination with technology and the way it can enrich human life. After graduating with a B.Tech degree in computer science, I worked in India as a software engineer for 4 years. Then I came to the USA, where I have been working for the past 13 years as a Software Architect and Data Engineer.
Q2. What is the key difference between a Data Scientist and a Data Engineer in a real-world project?
A data engineer’s job is to maintain the data infrastructure, whereas a data scientist uses this infrastructure to perform analytics on the data. In this sense, a data scientist is actually a client of a data engineer. A data scientist is like a chef making, say, a biryani, whereas a data engineer is the one who brings him the raw ingredients such as spices, rice, and vegetables. The role of the data engineer is quite important in the data science process, because without good quality data one cannot perform data science.
Q3. You run a very successful YouTube channel on data science named “CodeBasics”. Please tell us something about the channel. What was your main goal behind it?
I started codebasics because of my health struggle. In 2011 I was diagnosed with an incurable chronic disease called ulcerative colitis. I spent a few years after that visiting one doctor after another to find a cure, without any success. The disease was eating me up physically and mentally. I lost 31 kg (about 70 pounds) and I was sliding into depression. At some point I realized I needed some activity to occupy my mind so that it was not thinking about my disease all the time.
The activity had to be something that I enjoyed and that also gave something back to society. My father was a teacher, I have enjoyed teaching since I was a child, and YouTube was a perfect, passive way of teaching. Hence in November 2015 I started a channel called codebasics and began uploading Python tutorials. As of Nov 2020 the channel has 185K subscribers, and now my goal is to provide quality education for free to everyone.
Q4. Which is the better programming language for a new data science aspirant, Python or R?
Both are good, actually. Personally I am a big fan of Python, as it is a full stack development language. Not only can you do data science in it, but you can also do web development or any other backend development. R’s strength lies in its statistical packages. If you are still confused, please go with Python.
Q5. What is the importance of data in a data science project? What is the best procedure to clean data?
Good quality data is the most critical part of a data science project. Without it, the project is useless. It follows the garbage in, garbage out principle: one can build the most sophisticated model in the world, and yet if the input data is bad it will give you crappy results. Pandas has been my go-to tool for data cleaning, but there are hundreds of other tools on the market that let you do this.
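To make the data cleaning step concrete, here is a minimal sketch of a few common pandas cleaning operations. It is an illustration only, not a procedure described in the interview: the sample records, the column names (`city`, `temperature_c`) and the choice to fill missing values with the median are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical raw records; the columns and values are made up for illustration.
raw = pd.DataFrame({
    "city": [" Mumbai", "Delhi ", "Delhi ", None, "Pune"],
    "temperature_c": ["31", "28", "28", "30", "not available"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply a few typical cleaning steps to the hypothetical frame above."""
    out = df.copy()
    # Trim stray whitespace from the text column.
    out["city"] = out["city"].str.strip()
    # Convert text to numbers; unparseable entries become NaN instead of raising.
    out["temperature_c"] = pd.to_numeric(out["temperature_c"], errors="coerce")
    # Drop rows that have no city, then drop exact duplicate rows.
    out = out.dropna(subset=["city"]).drop_duplicates()
    # Fill the remaining missing temperatures with the column median
    # (an assumed policy; the right choice depends on the project).
    median = out["temperature_c"].median()
    out = out.assign(temperature_c=out["temperature_c"].fillna(median))
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

With this sample input, three rows remain (Mumbai, Delhi, Pune), and the unparseable Pune temperature is replaced by 29.5, the median of the two valid readings. Whether to fill, drop, or flag such values is a judgment call that depends on the data and the model being built.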
Q6. Why is Kaggle so important for a data science aspirant? Does a Kaggle rank really help in getting a job?
A Kaggle rank definitely helps in getting a job. Gone are the days when interviewers relied on your college GPA for hiring you. That has now been replaced by your Kaggle or Stack Overflow rank. Participating in Kaggle competitions allows you to interact with some amazing folks, and working in a group on a Kaggle project always helps a data science aspirant.
Q7. What is the role of deep learning in the data science domain? Is NLP part of deep learning?
The role of deep learning is very important in data science. Data science can be used for descriptive analysis or predictive analysis, and in predictive analysis deep learning plays a major role. Advanced deep learning techniques, along with fast hardware (GPUs and TPUs), allow you to do things that were almost impossible ten years ago.
You can do NLP without deep learning as well, but nowadays any effective NLP requires using deep learning in some capacity.