Top 50 Data Science Questions

Top 50 Data Science Questions about Machine Learning, Deep Learning, NLP, Apache Spark, Cloud, etc., with answers by top data scientists

Data Science (AI-ML-DL-NLP-Cloud)

Q1. What is the importance of data in a Data Science project? Does clean data help solve problems faster and save time?

Answer by Prakhar Mishra

To put it simply, data is to Data Science what oxygen is to life. Data Science is all about finding hidden patterns in data; with no data, the whole purpose is defeated. Indeed, clean data is one of the factors that can accelerate the problem-solving process and help us come up with relevant insights. How much time it saves depends on one’s skill in making use of the data, choosing models wisely, experimentation throughput, and so on.

Q2. What is the key difference between a Data Scientist and a Data Engineer in a real-world project?

Answer by Dhaval Patel

A data engineer’s job is to maintain the data infrastructure, whereas a data scientist uses this infrastructure to perform analytics on the data. In this sense, the data scientist is actually a client of the data engineer. A data scientist is like a chef making, say, a biryani, whereas the data engineer is the one who brings the raw ingredients such as spices, rice, and vegetables. The role of the data engineer is quite important in the data science process, as without good-quality data one cannot perform data science.

Q3. What is the main use of Apache Spark in the Machine Learning and Data Science domain?

Answer by Srivatsan Srinivasan

Apache Spark is used to build scalable machine learning on large datasets. So if you have a dataset that has outgrown what a Python or R dataframe can handle, Spark becomes an obvious choice for many reasons, even though there are frameworks like Dask and others. Some reasons that I feel work in Spark’s favor are:

  • Support for multiple languages – Python, R, Scala, and others
  • Built-in scalable algorithms as well as support for most Python ML packages like TensorFlow, XGBoost, LightGBM, Prophet, and many others
  • The ability to seamlessly run the same job on-premises in your data center and on any cloud, either with cloud-native services like AWS EMR, GCP Dataproc, and Kubernetes, or with Databricks. This way the enterprise is always cloud-ready
  • Connectivity to significant data sources – HDFS, RDBMS, cloud-based storage, cloud data warehouses, file systems, and many more
  • Investment across the many enterprises that have embarked on their Big Data journey, which is another reason I feel works in Spark’s favor

Q4. Would a good hands-on grasp of Statistics help one become a good Data Scientist?

Answer by Madhukar Kumar

Definitely! If data science is the human body, statistics is its heart. I don’t know how the notion took hold in the market that you can excel in data science without learning statistics. That’s not true. You can’t be a good data scientist if you don’t understand the statistics behind the algorithms. If your algorithm is not performing as expected and you don’t understand the statistics and theory behind it, you will be clueless; you can only fix it if you know them.

Q5. Please share something about your book “Approaching (Almost) Any Machine Learning Problem”. How is it different?

Answer by Abhishek Thakur

Approaching (Almost) Any Machine Learning Problem is not a traditional book. It is a code-heavy book that dives into applied machine learning and data science with good coding practices. The book starts with setting up a working environment and goes deep into feature engineering for categorical and numerical features. I talk about hyperparameter optimization and feature selection.

I also talk about how one should arrange their machine learning projects. And not only that, I dive a little bit into image-based problems and natural language processing, and close the book with how one can create reproducible models and distribute their code. There are many things in the book you won’t find in traditional books. The best way to get the most out of the book is to code along. I also made the book much cheaper so that many people can afford it. If you don’t like coding, you shouldn’t buy the book.

Abhishek Thakur’s book can be bought from Amazon: https://www.amazon.com/Approaching-Almost-Machine-Learning-Problem/dp/8269211508

Q6. Which cloud platform is best for deploying a Machine Learning model in the real business world?

Answer by Srivatsan Srinivasan

In my personal opinion, the top 3 cloud platforms are almost equally placed for ML and AI capability. Even if one of the providers is lagging on some capability, it is unlikely to be far from catching up. The selection of a cloud platform for ML depends purely on the organization’s data strategy on the cloud. It is not ideal to have data on one cloud and ML on another (even though it is possible). The choice of cloud has to be looked at from the application, data, and ML capability perspectives.

Coming to the capability of each cloud, I feel every cloud today has some uniqueness. GCP has its AutoML suite and better Cloud API accuracy, Azure has its nice drag-and-drop UI for model creation and deployment, and AWS offers similar capability.

Q7. Which is the better programming language for a new Data Science aspirant, Python or R?

Answer by Dhaval Patel

Both are good, actually. Personally, I am a big fan of Python as it is a full-stack development language. Not only can you do data science in it, but you can also do web development or any other backend development. R’s strength lies in its statistical packages. If you are still confused, please go with Python.

Q8. Which have you faced most often, Supervised or Unsupervised classification problems?

Answer by Krish Naik

Supervised Classification.

Q9. What are some gaps that you see in the field of data science right now?

Answer by Ravi Ranjan Prasad Karn

I see data science in two parts: one dealing with descriptive and diagnostic analytics and the other dealing with predictive and prescriptive analytics. The first is doing pretty well. But in the field of predictive and prescriptive analytics there are some gaps between the speculation and what is there on the ground. There are lots of use cases; some are doable, some are hypothetical. Also, the available workforce is immense, but the correct skill set is missing. But this field is maturing, and slowly all the gaps will be bridged.

Q10. Where is NLP required in Data Science projects?

Answer by Ramsri Goutham Golla

NLP is on the rise in data science with the advent of new transformer-based models. Also, GPT-3 created quite a buzz in the AI space. A lot of data in an organization is text-based, so finding insights and drawing business value from it is essential. So NLP is becoming increasingly important in data science.

Q11. How do Artificial Neural Network models work like the human brain and detect problems automatically?

Answer by Abhishek Kushwaha

One can compare Artificial Neural Network models with the human brain from the perspective of neurons: in humans, various neurons are connected, signals passing between them carry information, and the brain acts on it. In the same way, ANNs have neurons connected in a chained fashion through which information flows. But ANNs are nowhere close to how the brain works. ANNs need a lot of data to learn, while we humans don’t. ANNs do not learn the structure of the world as a whole. An ANN trained on one specific task cannot perform a different task, but we humans can extrapolate our understanding of the world and perform tasks even if we have not been trained for them.
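As a rough illustration of the "chained neurons" idea in the answer above, here is a minimal sketch of a forward pass through a tiny fully connected network. The layer sizes, weights, and the choice of a sigmoid activation are assumptions made purely for illustration; nothing here comes from the answer itself, and no training is shown.

```python
import math


def sigmoid(x: float) -> float:
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: every neuron takes a weighted sum of all inputs, adds its bias,
    and applies the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Chain the layers: the output of one layer is the input of the next."""
    activation = inputs
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]),  # hidden layer (weights, biases)
    ([[1.0, -1.0]], [0.0]),                     # output layer (weights, biases)
]
output = forward([0.0, 0.0], layers)
```

In a real network the weights would be learned from data rather than written by hand, which is where the "lot of data" mentioned above comes in.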

Q12. How is NLP helpful in solving real-world problems?

Answer by Sudhanshu Kumar

NLP can contribute a lot towards solving our day-to-day problems. By now, you must have had a conversation with a bot, and it might have been so good that you never noticed. Examples include your mobile autocorrect and predictive text, Gmail predicting your email sentences, social-media-driven marketing of products and election campaigns, targeted advertisements based on your search history, enterprise application log analytics and optimization recommendations, AI newsreaders, bots that can generate stories, automatic language translators, and whatnot. The world has changed a lot, we are keeping pace with it, and AI is the tool.

Q13. Where is “Time Series Modelling and Analysis” used, and what are the benefits of Time Series in the real world?

Answer by Srivatsan Srinivasan

Time Series has always been critical for businesses, be it in retail for sales or product demand forecasting, in energy for demand forecasting, or across industries for call demand forecasting, workforce forecasting, etc. In today’s world, with IoT, AIOps, and Predictive Maintenance, among others, Time Series models have even more relevance.

Most institutes and courses focus on traditional machine learning and deep learning, and less on Time Series. There has been a sea change in Time Series modeling over time, with newer tools like Facebook Prophet, DeepAR, LSTM, etc., showing state-of-the-art results in some cases. I think my Time Series course is popular because it covers both statistical and advanced techniques and demonstrates them with some useful real-world datasets. A few aspects that, in my view, differentiate the course are its coverage of scaling time series using Spark, Time Series anomaly detection, and multivariate and multiple time series, apart from the regular Time Series modeling coverage.
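For readers who have not seen a statistical forecast before, the following is a minimal sketch of simple exponential smoothing, one classical baseline for the demand-forecasting use cases mentioned above. It is not taken from the course described in the answer; the function name, the `alpha` default, and the sample sales figures are all illustrative assumptions.

```python
def exponential_smoothing_forecast(series, alpha=0.5):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    The smoothed level starts at the first observation and is updated as
    level = alpha * value + (1 - alpha) * level for each later observation.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")

    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical weekly sales figures, used only to show the call.
weekly_sales = [120, 130, 125, 140, 150]
next_week = exponential_smoothing_forecast(weekly_sales, alpha=0.3)
```

A larger `alpha` makes the forecast react faster to recent values; a smaller one smooths out noise. Tools such as Prophet or LSTM-based models add trend, seasonality, and covariates on top of this kind of baseline.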

Q14. Which type of Machine Learning is used more to solve real-world problems, Supervised or Unsupervised?

Answer by Rajdeep Pal

Unsupervised learning is still something that’s not very deterministic. Obviously, we use clustering and other unsupervised algorithms but supervised learning algorithms are still mostly used in the industry primarily due to the vast amount of data available, and also due to their deterministic nature.

Q15. What kinds of challenges have you faced in the AI and Data Science industry?

Answer by Avishek Nag

The biggest challenge is bringing Data Science awareness. Many organizations start with a lot of enthusiasm, but after a certain period, when it is time to see results in a real environment, people lose track. This happens due to a lack of awareness and benchmarking around Data Science. People misunderstand the life cycle of AI/ML-based applications. It involves a fair amount of research, and it may happen that after investing sufficient time nothing comes out. It does not go like traditional waterfall or even modern-day agile methods. It has its own way, and people have to be open about it. Ultimately, Data Science practice adds a lot of value to any organization in today’s world. It can bring significant growth in the long run.

I have observed that Data Scientists and ML practitioners are often misunderstood by people and senior management. This is driven by fear and a lack of awareness about the subject, which is not desirable at all.

It is high time that the industry invested time in benchmarking and bringing process to this area. In some places this activity has already started, but a lot still needs to be done.

I feel Data Scientists like me should come forward and educate people about the subject through books, blogs, videos, lectures, and many other possible ways. After all, it is the future.

Q16. How important is Deep Learning, as opposed to Machine Learning, for solving real-world problems?

Answer by Dr. Pradeep K Mavuluri

Both exist and are developed for solving real-world problems; however, their application depends on data availability, scalability, and the other issues you are trying to address.

Let’s take the example of email spam detection: it evolved over a period, from machine learning earlier to deep learning now. For the loan approval process, however, machine learning is still used, and there is no perceived need for deep learning yet.

Q17. How do you help as a Data Scientist on Social Media platforms?

Answer by Vaibhav Saxena

Social Media is an interesting domain that entails a whole new dimension of challenge. Other domains do not really force you to optimize your models in production to cut inference time, but in this domain the challenge is brutally pervasive. Making a good model (which is itself a challenge in Social Media) is one thing; making it respond lightning fast is quite another. As for the tasks, the data is full of users’ conversations in the form of posts, comments, and messages.

You have to work on making the user’s experience better, come up with ingenious ideas, build on those ideas, and complete the tasks at minimal cost. Tasks like delivering the perfect search results, recommending the most relevant friends/pages to users, text autofill, and controlling hate speech are notable mentions. However, the key is optimization: not only how good your model is, but also how fast it is.
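Since the answer stresses inference speed, here is a minimal sketch of how one might measure per-request latency before optimizing anything. The `toy_predict` function is a hypothetical stand-in for a real model, and the choice of median and 95th-percentile statistics is an assumption about what is worth tracking, not something stated in the answer.

```python
import statistics
import time


def measure_latency_ms(predict, sample, runs=200):
    """Call predict(sample) repeatedly and summarize wall-clock latency in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    return {
        "median_ms": statistics.median(timings),
        # Nearest-rank style 95th percentile on the sorted timings.
        "p95_ms": timings[int(0.95 * (len(timings) - 1))],
    }


def toy_predict(text):
    """Hypothetical stand-in for a model: flags posts longer than three words."""
    return len(text.split()) > 3


latency = measure_latency_ms(toy_predict, "this is a sample social media post")
```

Tail latency (such as the 95th percentile) often matters more than the average for user-facing features, because slow outliers are what users notice.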

Q18. What is the role of Deep Learning in the Data Science domain? Is NLP part of Deep Learning?

Answer by Dhaval Patel

The role of deep learning is very important in data science. Data science can be used for descriptive analysis or predictive analysis. In predictive analysis, deep learning plays a major role. Advanced deep learning techniques along with fast hardware (GPUs and TPUs) allow you to do things which were almost impossible ten years ago.

You can do NLP without deep learning as well but nowadays any effective NLP requires using deep learning in some capacity.

Q19. How do you see the AI and Data Science domain as a job producer in the next 10 years?

Answer by Madhukar Kumar

AI and data science are tools to solve business problems. Earlier, problems were simple, and that’s why standard, traditional statistical techniques were able to solve various business problems. But now data is getting bigger and bigger and problems are becoming more complex with each passing day. There is no way we would be able to solve these problems without AI and data science. So, in my view, demand for professionals who can efficiently use various algorithms to solve business problems is only going to increase in future.

Q20. Is NLP part of Deep Learning or something different? Where can anyone use NLP?

Answer by Shivam Kotwalia

NLP is a very broad term and industry has been using it everywhere.

In my understanding, text is a cognitive vertical like images and speech; the relationship between text and NLP is similar to that between images and image processing. Deep Learning entered the domain quite recently and revolutionized it. With the release of attention-based models and transformers, NLP is going wild.

Regarding usage, text is more freely available than any other form of data. Just name the domain and NLP can be applied, from building a fail-safe campaign for the M&S team, to checking your employee NLP score for HR, to product categorization and supplier negotiation for Procurement teams.

Q21. What is the future of Artificial Intelligence, ML and Data Science? Is it a good career option for students?

Answer by Harendra Kumar Dadhich

The future of AI, ML and data science is very bright; as I said previously, data scientist is the sexiest job of the 21st century. So, if one becomes a data scientist, one will gain a prestigious position in the company along with a good salary. In India, the average salary of a data scientist is higher than in other jobs such as software engineer, Java developer, and many more. AI is the future of the earth; slowly, it will cover all fields. So, it is a very good option for students.

Q22. NLP is performing very well in the Healthcare domain. How is it useful in other industries?

Answer by Abhishek Kushwaha

In the real world, text and voice constitute 70-80% of our life, so NLP has many more applications than computer vision. In domains like law, NLP is being applied to help lawyers prepare a strong case for their clients by extracting relevant points from past orders and laws. Again, chatbots in customer service are the biggest application of NLP.

Kaggle

Q23. What is Kaggle?

Answer by Usha Rengaraju

Kaggle is a one-stop playground for data scientists where you will find competitions catering to all skill levels. Data scientists can explore and run ML code with Kaggle Notebooks hosted on the cloud, and also explore, analyze, and share high-quality data.

Q24. Why is Kaggle so important for a Data Science aspirant? Does a Kaggle rank really help in getting a job?

Answer by Shruti Bhutaiya 

Kaggle provides an environment and a stage to explore, learn, enhance, and showcase your skills in Data Science. It is a virtual hub where data science aspirants from across the world gather and share their knowledge and experience. A Kaggle rank definitely helps in getting a job, as it showcases your skills, talent, and hard work.

Q25. You are the world’s first Kaggle Quadruple Grandmaster. How did you achieve this milestone?

Answer by Abhishek Thakur

If I had to answer this question in one word, I would say perseverance. A never-ever-give-up attitude and learning from the solutions of the best data scientists helped me a lot.

Online Courses

Q26. There are many online courses on Data Science on the web. Which are the best online courses for a Data Scientist?

Answer by Abhishek Mamidi

  • Yes. There are many online courses on Data Science, and it’s very difficult for a newbie to decide which course to take.
  • In my view, the below two courses are good to start.
  • If you are stuck on some concept/algorithm, you can always take help from YouTube videos, Medium blogs, or Stack Overflow.

Q27. What do you think about online Data Science courses? Are these courses sufficient to make someone a Data Scientist?

Answer by Shruti Bhutaiya 

You have to choose a website and a course carefully. Not every course provides enough information. Explore first, read blogs, and only then choose. These courses are not sufficient, but they are necessary. You need experience to be a data scientist, but you also have to go through the necessary courses at least once.

Q28. You run a YouTube channel on Data Science. What was your main motive behind this online venture? Please tell us briefly about it.

Answer by Prakhar Mishra

In the last few years, research in the NLP domain has boomed. I realised that NLP practitioners and enthusiasts seldom refer to these research papers due to the complexities involved. Hence, I started this YouTube channel to demystify the cutting-edge research going on in this domain. I also strongly believe that “The best part of learning is sharing what you know”. This was the major motivation that pushed me to kickstart this passion project of mine, and it has now been around 5 months since I started my YouTube channel, TechViz – The Data Science Guy.

Become a Data Scientist

Q29. How can a fresher become a Data Scientist? Please suggest a good path for aspirants.

Answer by Ravi Ranjan Prasad Karn

To become a real Data Scientist, one should be good at the theoretical concepts of computer science and software development, linear algebra, statistics, and domain knowledge. Data Engineering, Data Analysis, Machine Learning Development, etc. are the different aspects of data science. I would say that freshers should work on the theoretical aspects of Data Science if they have not acquired that knowledge during their studies. Then they should focus on one of the practical aspects of data science. They can work as a Data Engineer or Data Analyst to start with, and gradually work on all the aspects of Data Science to become a Data Scientist.

Freshers can take training in the subjects they are not good at or do not know at all. They can choose whichever mode suits them, such as instructor-led online courses, classroom courses, or self-learning using books or materials they already have or find on the Internet, depending on their availability and budget.

An aptitude for data is the other thing a fresher needs in order to excel in the field of Data Science.

With the right knowledge and the right aptitude, freshers should choose the subfield of Data Science they want to work in. They will become Data Scientists in a few years.

Q30. How can a non-computer science student become a Data Scientist? Please suggest a good path for aspirants.

Answer by Gaurav Chatterjee

A non-computer science student, first of all, has to improve their problem-solving skills and learn to code effortlessly. Students can target both skills by learning Data Structures & Algorithms (DSA) and coding them. Also, DSA is the building block of computer science, so it will definitely help you on the journey.

After the student is comfortable with problem-solving and coding, they should do the following things:

  1. Opt for a good course on Machine Learning and study it thoroughly to become well versed in the concepts.
  2. Practice machine learning problems on Kaggle, which will help you gain confidence and give you enough hands-on skills.
  3. Post your projects on GitHub and LinkedIn; you can also use YouTube to showcase your projects.
  4. Time to market yourself. Make a clean and creative online portfolio and a strong resume based on ML. Start applying to your desired companies; surely circumstances will bend in your favour, and soon you will become a Data Scientist.

Q31. How to become a professional Data Scientist?

Answer by Saikat Basak

It will require a lot of passion and hard work.

One can start by enquiring what a Data Scientist actually does in their day-to-day life. That enquiry must include their life outside of work, because a lot of things go differently when you are a Data Scientist. A Data Scientist needs to constantly polish and upgrade their skill set, as there is a huge inflow of new knowledge, new tools, and new competition. So before one jumps into the deep end, they must realize what to expect. It will be tough to stay relevant in this job market if you aren’t upgrading yourself constantly.

Having said that, the path to Data Science is no secret. Rather, it is one of the most democratized areas of study. Almost all big technical institutes have made their AI/ML/Statistics curricula publicly available, and there are hundreds and thousands of MOOCs (Massive Open Online Courses) on the topic.

One might start with the basics of Statistics or Linear Algebra but needs to switch to a smarter learning policy after refreshing their 11th and 12th grade knowledge. That smarter learning policy might include choosing one particular domain of ML, e.g. Classical Machine Learning (this is important anyway), Natural Language Processing, Computer Vision, ML on Structured Data, etc.

Choosing one particular field will help the newcomer stay focused amid the plethora of ML content scattered throughout the Internet. Now it’s time to choose some interesting problems. Do you want to solve the problem of detecting cancer? Or do you want to identify fake news? Choose a problem that interests you. Try to solve the problem and learn while you do it.

Upload your solutions to GitHub or Kaggle, where the world can inspect, criticize, or use the solutions you have developed. This will boost your knowledge, confidence, and resume. A quick inside note: when we look for a Data Science candidate, we look for people who have some experience of working with real problems. Your GitHub or Kaggle profile is an excellent way of showcasing your work.

Use of Data Science

Q32. What is the real use of Data Science in the practical world?

Answer by Ravi Ranjan Prasad Karn

Data collected from a system has hidden patterns that can explain the conditions under which the data was generated. Data Science helps unveil those hidden patterns. They help in prediction and prescription based on new data that we acquire. Almost all domains can use data science to take data-driven decisions in the real world. You name the domain; data science can be used in it to build prediction and recommendation systems. The potential of Data Science can be leveraged in all walks of life.

Q33. How is Data Science useful for Life Science?

Answer by Usha Rengaraju

Data Science has huge applications in life science, and advancements in computational power have propelled research and progress in several promising areas. Neuromarketing is an evolving field where unconscious responses to marketing stimuli are measured from fMRI and other imaging techniques. Genomic data science opens up new possibilities for personalized medicine and has the potential to reshape drug discovery.

Q34. What is the real use of Data Science in the practical world?

Answer by Gaurav Chatterjee

In Data Science we take insights from data and generate future predictions from them. Some of the practical uses of data science are:

  1. Self Driving Cars
  2. Stock market prediction
  3. Pothole detection (I built this project; you can have a look at it here: https://www.youtube.com/watch?v=iqkPU2CUfh8)

Q35. How will Data Science help business growth?

Answer by Manu Bohra

Data speaks a lot about itself. Data Science has the power to handle data analytically and get insights from it. It can help companies get insights from good-quality data to make business decisions while maintaining a business process. Every day, companies generate a lot of data through their business processes, and this data has the power to boost their business with sustained growth in a controlled way and in the correct direction.

Problems faced in Data Science Domain

Q36. What kinds of problems have you faced most in your Data Science projects, and how did you resolve them?

Answer by Ashish Pal

Data… Yes, data collection is one of the biggest problems, and I believe everyone faces it. After that, understanding the data, understanding the domain, and data pre-processing are all crucial steps, and until you have hands-on experience you will not get them right.

So, it’s a loop: you go through multiple cycles of experiments and prototypes, read different research papers, and try out different strategies to figure out what actually works best for your problem. And last but not least, there is transfer learning, not of models but of techniques and strategies.

Q37. What kinds of problems have you faced in your Data Science projects, and how did you resolve them?

Answer by Saurabh Bhatt

The main problem in any data science project is productionizing it for the end user. It is also very important to understand who the end users are. To solve such issues, I first put myself in the place of the end user and think in multiple ways about what I could be missing. I then create a list of those issues and talk to the real end users. After getting their feedback, I try to build complete workflows that satisfy all the requirements.

Q38. What kinds of problems have you faced most in your Data Science projects, and how did you resolve them?

Answer by Shubhajit Panda

Mostly data processing, getting proper data, trying to find patterns in the data, and working out which data we need for the requirement.

Favourite Machine Learning Algorithm

Q39. Which is your favorite Machine Learning algorithm as a data scientist?

Answer by Krish Naik

XGBoost is my favorite Machine Learning algorithm.

Q40. Which is your favorite Machine Learning Algorithm/Model and why?

Answer by Manu Bohra

LDA (Latent Dirichlet Allocation) is my favorite. It is an unsupervised machine learning algorithm that takes documents as input and finds topics as output. LDA is very powerful with tf-idf for topic extraction and for analysing a large text corpus, and it is capable of reporting what percentage of each document is about each topic. LDA can be used in the process of graph database creation by analysing a large text corpus and extracting its most prominent topics. This is a great, advanced-level application of Natural Language Processing.
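To make the tf-idf part of the answer above concrete, here is a minimal sketch of tf-idf weighting in plain Python. It shows only the weighting step, not LDA itself; the whitespace tokenizer, the unsmoothed `log(N / df)` formula, and the sample documents are assumptions chosen for brevity, and library implementations typically use smoothed variants.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical two-document corpus.
corpus = ["spark scales data", "spark trains models"]
corpus_weights = tf_idf(corpus)
```

A term that appears in every document ("spark" here) gets weight zero, while terms unique to one document get positive weight, which is why tf-idf helps surface the words that distinguish documents from each other.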

Q41. Which is your favorite Machine Learning Model to solve the real world problems?

Answer by Avinash Navlani

Predicting satisfaction and frustration level of a candidate in an online exam using mouse movement data.

Q42. Which is your favorite Machine Learning Model and why?

Answer by Ashish Patel

Tree-based models, because they can handle missing values.
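One way a tree split can cope with missing values is to learn a default direction for them, an idea used by some gradient-boosted tree libraries. The sketch below is an assumption-laden illustration of that idea for a single split with a fixed threshold; the function names, the squared-error criterion, and the sample data are hypothetical, and the answer above does not say which mechanism its author had in mind.

```python
def _mean(values):
    """Mean of a list, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def split_error(values, labels, threshold, missing_goes_left):
    """Squared error of a single split when missing values (None) follow a fixed branch."""
    left, right = [], []
    for value, label in zip(values, labels):
        go_left = missing_goes_left if value is None else value < threshold
        (left if go_left else right).append(label)

    error = 0.0
    for group in (left, right):
        group_mean = _mean(group)
        error += sum((label - group_mean) ** 2 for label in group)
    return error


def choose_missing_direction(values, labels, threshold):
    """Return True if sending missing values left gives an error no worse than sending them right."""
    return (
        split_error(values, labels, threshold, True)
        <= split_error(values, labels, threshold, False)
    )


# Hypothetical feature column with one missing entry.
feature = [1.0, 2.0, None, 8.0, 9.0]
target = [0, 0, 0, 1, 1]
missing_left = choose_missing_direction(feature, target, threshold=5.0)
```

Because the direction is chosen from the training data, no separate imputation step is needed for that feature; rows with a missing value simply follow the learned default branch at prediction time.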

And Many More Data Science Questions

Q43. Which is the hardest part of any Data Science project?

Answer by Abhishek Mamidi

  • In my view, creating features from raw data is one of the hardest parts of a Data Science project. It requires domain knowledge and knowing the ins and outs of the problem we are solving. Without good features, we cannot build a good model.
  • The next hardest part is approaching and solving the problem itself. Real-world problems are not only about modeling; they involve brainstorming on a lot of different situations. For example, if we are building a recommendation engine, we should think through all possible cases. Here, the final algorithm should be able to recommend appropriate items to a new user, a new item to an existing user, or a new item to a new user, with limited information.
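The new-user case in the recommendation example above is often called the cold-start problem. The following minimal sketch shows one common fallback, recommending globally popular items when nothing is known about the user. The data layout, the overlap-based scoring for existing users, and all names are illustrative assumptions; the new-item case is not covered here.

```python
from collections import Counter


def recommend(user_id, interactions, k=3):
    """Recommend up to k items.

    interactions maps each user id to the set of item ids that user has interacted with.
    """
    popularity = Counter(item for items in interactions.values() for item in items)
    seen = interactions.get(user_id, set())

    if not seen:
        # Cold start: nothing is known about this user, so fall back to popular items.
        return [item for item, _ in popularity.most_common(k)]

    # Existing user: score unseen items by how much their owners overlap with this user.
    scores = Counter()
    for other_id, items in interactions.items():
        if other_id == user_id:
            continue
        overlap = len(seen & items)
        if overlap:
            for item in items - seen:
                scores[item] += overlap

    ranked = [item for item, _ in scores.most_common(k)]

    # Pad with popular unseen items if the overlap signal produced too few candidates.
    for item, _ in popularity.most_common():
        if len(ranked) >= k:
            break
        if item not in seen and item not in ranked:
            ranked.append(item)
    return ranked


# Hypothetical interaction history.
history = {"a": {"x", "y"}, "b": {"x", "y", "z"}, "c": {"x"}}
for_new_user = recommend("new_user", history, k=2)
for_existing_user = recommend("c", history, k=2)
```

Real systems usually blend several signals (item metadata, context, exploration) for cold start, but a popularity fallback is a simple baseline that makes sure a new user never sees an empty list.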

Q44. What one thing would you like to change in the Data Science domain, and why?

Answer by Ayush Aggarwal

Data science is a very diverse topic. It includes different skills – visualization, data architecture, programming, storytelling, and so on. In my view, to understand the problem we are trying to solve, we should have an end-to-end understanding of data science so that we know all the constraints and the options we can implement. It would be great if there were more focus on end-to-end understanding rather than on learning new models.

Q45. Are higher studies required if one wants to go into the field of data science?

Answer by Ravi Ranjan Prasad Karn

Going for higher studies is not required if one wants to be a data scientist, but acquiring all the necessary knowledge and tools is.

In my opinion, going for higher studies is good for working in the Data Science field. A specialized full-time course helps a lot in understanding the underlying concepts, which helps in solving new problems. In a specialized full-time course, one acquires all the required skills faster and in the right way, if the curriculum is designed properly. Other short-term courses can be beneficial to some extent, but higher studies and short-term courses cannot be compared. Self-learning also takes much more time, and finding the right direction is a problem. Mentoring is another aspect. In a full-time course (higher studies), your professors mentor you, which helps you launch yourself well in the data science space. Mentors are difficult to find at the workplace, but if someone finds one, there is nothing better.

Q46. What is the role of Python in Machine Learning, Deep Learning and Data Science?

Answer by Vishwas BV

Python has become the go-to language for data science, a demand born out of the language’s versatility. Python is easy to learn, has a wide community base, and there is a package for almost everything.

Q47. What attracted you to Geospatial Data Science?

Answer by Soumya Kanta Dash

During my coursework and research, I observed that AI and machine learning combined with geospatial technology could answer various open questions that the industry has been pursuing and could make traditional workflows more efficient. I realized that geospatial data is best suited for machine learning applications due to the huge availability of data and its capability to provide answers in complicated scenarios.

Q48. Is the Data Science domain at an early stage, or has it reached a saturation point?

Answer by Ashish Pal

I will say it is just a starting point. Only in the last 2 years have people actually realized that it is not just hype but a great possibility, and there is no escape. Everyone in the world is harnessing data and creating winning solutions that no human ever thought possible. The pace at which advancement is happening is uncontrollable. So whether you want to or not, you have to be part of this race if you want to survive.

So from a research point of view, great work is happening everywhere, and COVID has worked as a catalyst in bringing that research into actual production, which will change everything around the world. The companies and products that were ahead in this digital journey have cashed in on the COVID situation, whereas the businesses that were still working in their traditional way have suffered a lot. Therefore, it is clear that this is just the start of a new era.

Q49. Which is the hardest part of any Data Science project?

Answer by Mayur Domadiya

The hardest part of any Data Science project is data collection and data cleaning. This step alone takes a long time in data science, and if you don’t have proper data your model cannot learn patterns, so always focus on this step.

Q50. You have both software development and data science experience. Which one do you like most and why?

Answer by Kunjan Shah

Here, you have got me wrong. I am not working as a Data Scientist. I am a DotNet Developer. I am an aspiring Data Scientist who is fighting hard to bag an opportunity. And there is a deadlock here: to get an opportunity in Data Science, recruiters ask for project experience, but to get experience, you need an opportunity. I reached out to my LinkedIn contacts about this, and many have advised me to go for higher studies too.

Considering Covid-19, Company budgets for R&D and New projects have been drastically reduced. Company Management wants to keep the existing projects running at top priority.

Fortunately, I have two kind hearted Seniors who are mentoring me and occasionally giving me opportunities to work on Data Science projects. So currently, I am working as an On-Request Machine Learning Engineer at my company itself.

I started my IT career as a Desktop Developer – WinForms. To survive in the IT industry and for my own technical and financial growth, I needed to learn something else too. I picked Web Development and Data Science as two options. I tried both fields. I did certifications and online courses on ReactJS, Angular, ASP.Net and Data Science.

Among all of these, I found Data Science more interesting and more suitable for me. I am quite good at Mathematics, so I understand Machine Learning related math to a very good extent. I find it challenging to perform Data Analysis and Data Pre-Processing. Data Science requires a good amount of brainstorming for any problem statement you are working on. It expects you to think critically.

Gradually I started spending more time on Data Science. Currently, I am focused on building a good Portfolio on GitHub and Kaggle.
