Ravi Ranjan Prasad Karn
Data Scientist, Machine Learning, Deep Learning,
Artificial Intelligence, Natural Language Processing
Bengaluru, Karnataka, India
Q1. Please share your educational and professional journey.
- B.E. in Computer Science and Engineering (with honors)
- PGP in Data Science, Business Analytics and Big Data
- Pursuing Post Graduate Diploma in Applied Statistics
- 15 years of industry experience
- Worked as Software Engineer at Centre for Railway Information Systems
- Worked as Software Analyst, Database Developer and Data Analyst with Wipro
- Worked as Principal Engineer at Mercedes Benz Research and Development India
- Working as Data Scientist at Hewlett Packard Enterprise
Q2. What attracted you to the Data Science domain?
I am fascinated by data, mathematics and statistics. I like working with data, analyzing it and finding the story hidden in it. I also loved Artificial Intelligence during my engineering studies. My final-year project was on Artificial Intelligence, though AI was not a buzzword in those days. My love for data, mathematics and statistics, together with a relevant background and qualifications, drew me towards Data Science.
Q3. When is NLP required in a project?
NLP works on human language data, both spoken (sound) and written (text), rather than on numerical data. When we wish to analyze such data, we use NLP. Natural Language Generation (NLG) is used where we need to generate output in natural language that human beings understand. NLP and NLG have use cases in various fields, such as chatbots, brand awareness and market research, auto-generated reports and news synopses. In short, NLP is required in any project where we want to use human language data for analysis or prediction.
Q4. Which Machine Learning algorithm do you use most in your real-world projects?
I have worked with many classical Machine Learning and Deep Learning algorithms for regression, classification and clustering, such as Linear Regression, Logistic Regression, Decision Trees, ensembles, SVM, RNN and CNN. But the Decision Tree has been my all-time favorite because its decision rules can be interpreted, a transparency that many algorithms, such as deep learning algorithms, do not offer. Decision Trees are also tolerant of outliers and null values. Our data had a few outliers and null values that we did not know how to treat. Working in R&D, we also needed to interpret the decision rules. So the Decision Tree fit my requirements in the specific projects I worked on.
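As an illustration of why decision rules are easy to interpret, here is a minimal sketch of a tiny decision tree learner that prints its rules as readable if/else text. It is not the code used in the projects mentioned above: the sensor readings, the feature names (temperature, vibration), the labels and all function names are hypothetical, and the split criterion (Gini impurity with midpoint thresholds) is just one common choice.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between neighbouring distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build(rows, labels, depth=0, max_depth=2):
    """Grow a tree: a leaf is a label, a node is (feature, threshold, left, right)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def rules(node, names, indent=""):
    """Render the tree as human-readable if/else lines."""
    if not isinstance(node, tuple):
        return [f"{indent}predict {node}"]
    f, t, left, right = node
    return ([f"{indent}if {names[f]} <= {t}:"] + rules(left, names, indent + "    ")
            + [f"{indent}else:"] + rules(right, names, indent + "    "))

def predict(node, row):
    """Follow the rules from the root down to a leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

# Hypothetical readings: [temperature, vibration]; 900 is an extreme temperature.
rows = [[60, 2], [65, 3], [70, 9], [72, 2], [90, 4], [95, 8], [900, 8], [62, 9]]
labels = ["ok", "ok", "fault", "ok", "ok", "fault", "fault", "fault"]
names = ["temperature", "vibration"]

tree = build(rows, labels)
print("\n".join(rules(tree, names)))
```

The printed rules can be read and checked by a domain expert directly, which is the kind of transparency a neural network's weights do not give. In this toy data the extreme temperature reading does not change the learned rule, since the tree splits on vibration.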
Q5. How can a fresher become a Data Scientist? Please suggest a good path for aspirants.
To become a real Data Scientist, one should be good at the theoretical concepts of computer science and software development, linear algebra and statistics, and should have domain knowledge. Data Engineering, Data Analysis, Machine Learning development and so on are different aspects of Data Science. I would say that freshers should work on the theoretical aspects of Data Science if they did not acquire that knowledge during their studies. Then they should focus on one of the practical aspects of Data Science. They can start as a Data Engineer or Data Analyst and gradually work on all the aspects of Data Science to become a Data Scientist.
Freshers can take training in the subjects they are weak in or do not know at all. Depending on their availability and budget, they can choose instructor-led online courses, classroom courses, or self-learning from books and materials they already have or can find on the Internet.
An aptitude for data is the other thing a fresher needs to excel in the field of Data Science.
With the right knowledge and the right aptitude, freshers should choose the subfield of Data Science they want to work in. They will become Data Scientists in a few years.
Q6. What kind of problems have you faced in your Data Science projects and how did you overcome them?
Data Engineering (data acquisition and data curation) is the part of Data Science projects where most Data Engineers and Data Scientists struggle. 90% of the time and resources are spent on Data Engineering itself. If the input data is not good, it is not possible to get the desired output: garbage in, garbage out.
Like any other Data Scientist, I also faced challenges in Data Engineering.
To overcome this challenge, I focus more on this part. Finding the right data sources, getting the right data, and curating it by identifying outliers, filling in unavailable data (NULL values in Data Engineering terms) and handling erroneous data is where I work most extensively, so that the rest of the project goes well.
I would suggest that every practicing and aspiring Data Scientist or Data Engineer focus most on data engineering to make the project successful. It is said that “Well begun is half done.” In Data Science projects, “Data engineering done well is 90% done.”
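The curation steps described in this answer (identifying outliers and filling in NULL values) can be sketched roughly as follows. This is only an illustration under stated assumptions: the readings are made up, the function names are hypothetical, and the interquartile-range rule and median fill are one common choice among many, not necessarily what was used in the projects discussed here.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences using the interquartile-range rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

def curate(readings, k=1.5):
    """Fill missing values (None) with the median of the inliers and
    report the positions of outliers without altering them."""
    present = [v for v in readings if v is not None]
    low, high = iqr_bounds(present, k)
    inliers = [v for v in present if low <= v <= high]
    fill = statistics.median(inliers)
    cleaned = [fill if v is None else v for v in readings]
    outliers = [i for i, v in enumerate(readings)
                if v is not None and not low <= v <= high]
    return cleaned, outliers

# Hypothetical sensor readings with two NULLs and one suspicious spike.
readings = [21.0, 22.5, None, 23.0, 250.0, 22.0, None, 21.5, 24.0]
cleaned, outliers = curate(readings)
print("cleaned: ", cleaned)
print("outlier positions:", outliers)
```

Outliers are only flagged here rather than replaced, because whether a spike is an error or a genuine event usually needs a decision from someone who knows the data source.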
Q7. Where do you want to see yourself in your career after 10 years?
I am working in the Data Science field in industry. I am also enhancing my skills in writing (books, blogs, etc.) and in training people in the Data Science domain. I do not know exactly where I will be after 10 years, but I aspire to become the Chief Data Scientist of a company, one of the most-read Data Science writers and one of the best Data Science trainers.
Q8. What is the real use of Data Science in the practical world?
Data collected from a system has hidden patterns that can explain the conditions under which the data was generated. Data Science helps unveil those hidden patterns, which in turn help in prediction and prescription based on new data that we acquire. Almost every domain can use Data Science to take data-driven decisions in the real world. Name any domain, and Data Science can be used in it to build prediction and recommendation systems. The potential of Data Science can be leveraged in all walks of life.
Q9. Which one of your skills do you like the most?
I have good analytical skills. I am good not only at analyzing data but also at analyzing people. This is the skill I like most about myself. It helps me love the Data Science field and work efficiently in it.
Q10. What are some gaps that you see in the field of data science right now?
I see data science in two parts: one dealing with descriptive and diagnostic analytics, and the other dealing with predictive and prescriptive analytics. The first one is doing pretty well. But in predictive and prescriptive analytics there are some gaps between the speculation and what is there on the ground. There are lots of use cases; some are doable, and some are hypothetical. Also, the available workforce is immense, but the right skill set is missing. Still, this field is maturing and slowly all the gaps will be bridged.
Q11. Are higher studies required if one wants to go into the field of data science?
Going for higher studies is not required to be a data scientist, but acquiring all the necessary knowledge and tools is.
In my opinion, going for higher studies is good for working in the Data Science field. A specialized full-time course helps a lot in understanding the underlying concepts, which helps in solving new problems. In such a course, one acquires all the required skills faster and in the right way, provided the curriculum is designed properly. Short-term courses can be beneficial to some extent, but they cannot be compared with higher studies. Self-learning takes much more time, and finding the right direction is a problem. Mentoring is another aspect. In a full-time course (higher studies), your professors mentor you, which helps you launch yourself well in the data science space. Mentors are difficult to find at the workplace, but if someone finds one, there is nothing better than that.
Find out more about Ravi Ranjan Prasad Karn at:
LinkedIn Profile: https://www.linkedin.com/in/ravi-ranjan-prasad-karn/
Twitter: https://twitter.com/raviranjankarn (@raviranjankarn)