Interview with Prakhar Mishra
Research Scholar (Data Science)
Blogger, YouTube Channel (TechViz – The Data Science Guy)
Bengaluru, Karnataka, India
Q1. You have a huge interest in NLP. What is the main reason behind it, and where can we see real-world examples of NLP?
My NLP research journey started with an introduction to ML/AI as a concept during my final-year B.Tech project. The possibility of solving real-world challenges so intuitively with this technology pushed me to explore NLP in more depth. Hence, after my bachelor’s, I decided to work as a Data Scientist for a startup in Bangalore. The rest is history, as I completely fell in love with this field.
Three years of industry exposure made me realize that there is so much more to this field. I was amazed to witness the exponential growth and cutting-edge research involving NLP. As I continue exploring this field, I see a lot of real-world challenges that can be solved using NLP. Some examples that come to mind are search engines, chatbots, voice assistants, language translation, fake news detection, and email smart reply.
Q2. You are doing an MS by Research in Data Science. What kind of research are you doing, and what are your research interests?
Yes, I am currently in the second year of the MS by Research (MSR) program at the International Institute of Information Technology, Bangalore (IIIT-B). I work mainly in the domain of NLP, and my thesis project is about automatically generating a short preview video that summarizes a given set of text resources. Apart from generative NLP applications such as summarization and translation, my research interests also include unsupervised learning, adversarial training, and conversational AI.
Q3. Which type of machine learning, supervised or unsupervised, is used more to solve real-world problems?
With the recent explosion of active research and open-source pre-trained models, the boundaries between solving real-life problems via supervised and unsupervised models have blurred. The possibilities for leveraging supervised learning, unsupervised learning, or both have opened up massively. Having said that, I would like to highlight certain factors that influence the choice of learning technique. For instance, access to clean, well-labeled data sets usually determines whether one can go for a purely supervised or purely unsupervised model, or needs to find a middle path of unsupervised training followed by supervised training using an active learning paradigm.
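As a rough illustration of the active learning step mentioned above (an editorial sketch, not code from the interview), the snippet below shows uncertainty sampling, one common way to decide which unlabeled examples to send for labeling next. It assumes a binary classifier; the function name `select_for_labeling`, the `budget` parameter, and the scores are hypothetical.

```python
def select_for_labeling(probabilities, budget):
    """Return indices of the `budget` most uncertain examples.

    `probabilities` holds a model's predicted probability of the positive
    class for each unlabeled example (binary classification is assumed).
    """
    # A prediction is most uncertain when it is closest to 0.5.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:budget]


# Hypothetical model outputs for five unlabeled texts.
scores = [0.95, 0.52, 0.10, 0.45, 0.80]
to_label = select_for_labeling(scores, budget=2)
print(to_label)  # indices of the two examples nearest to 0.5
```

In a full loop, the selected examples would be labeled by a person, added to the training set, and the model retrained before the next round of selection.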
Q4. You run a YouTube channel on Data Science. What was your main motivation behind this online venture? Please tell us briefly about it.
In the last few years, research in the NLP domain has boomed. I realised that NLP practitioners and enthusiasts seldom refer to these research papers because of the complexities involved. Hence, I started this YouTube channel to demystify the cutting-edge research going on in this domain. I also strongly believe that “The best part of learning is sharing what you know”. This was the major motivation that pushed me to kickstart this passion project of mine, and it has now been around 5 months since I started my YouTube channel, TechViz – The Data Science Guy.
Q5. What is the importance of data in a Data Science project? Does clean data help solve problems faster and save time?
To put it simply, data is to Data Science what oxygen is to life. Data Science is all about finding hidden patterns in data; with no data, the whole purpose is defeated. Indeed, clean data is one of the factors that can accelerate the problem-solving process and help us come up with relevant insights. How much time it saves will depend on one’s skill in making use of the data, choosing models wisely, experimentation throughput, and so on.
Q6. You also blog about Data Science. How do you pick a topic for your blog?
My major motivation for documenting a variety of topics in a blog was to take this magical domain of ML, AI, and Data Science to multiple audiences, such as students, enthusiasts, and practitioners. While there is no rule of thumb I follow to choose a topic, I usually try to create content aligned with the needs of my audience. From short blog posts on key concepts to elaborate ones explaining an algorithm or library package, I prioritize these posts based on popularity, recency, fundamentals, etc.
Q7. Which machine learning model have you used the most in your research?
Given the problem statements I am currently working on, I mostly work with sequence models and transformers.
Read more about Prakhar Mishra @
LinkedIn – https://linkedin.com/in/prakhar21