Anurag Roy

Pursuing a PhD in Multimodal Machine Learning

CSE, IIT Kharagpur


I graduated with Hons. from the Computer Science and Technology department of IIEST Shibpur in 2017. I worked as a software engineer at Polaris Networks for a year (July 2017 - July 2018). I am currently a Ph.D. scholar at IIT Kharagpur under the supervision of Prof. Saptarshi Ghosh of the Computer Science and Engineering Department. I am also a member of the Complex Networks Research Group (CNeRG).


  • Machine Learning
  • Multimodal Learning
  • Information Retrieval


  • PhD in Computer Science and Engineering, 2019

    Indian Institute of Technology Kharagpur

  • B.E. (Hons.) in Computer Science and Technology, 2017

    Indian Institute of Engineering Science and Technology, Howrah, West Bengal

  • CBSE XII Science with Computer Application, 2013

    Army Public School, Ballygunge, Kolkata, West Bengal


Senior Research Fellow

CNeRG Lab, CSE Department, IIT Kharagpur

Jul 2020 – Present Kharagpur, West Bengal

Applied Scientist Intern

India ML Team, Amazon Development Center India Pvt. Ltd.

May 2019 – Jul 2019 Bengaluru, Karnataka
Developed machine learning models to predict product trust scores.

Ph.D. Research Scholar

CNeRG Lab, CSE Department, IIT Kharagpur

Jan 2019 – Present Kharagpur, West Bengal
Under the supervision of Prof. Saptarshi Ghosh.

Junior Research Fellow

CNeRG Lab, CSE Department, IIT Kharagpur

Jul 2018 – Jul 2020 Kharagpur, West Bengal

Software Developer

Polaris Networks

Jul 2017 – Jul 2018 San Jose, California

Responsibilities include:

  • Development of a 5G core network stack
  • Development of an MCPTT tester

Research Intern

IIT Kanpur

May 2017 – Aug 2017 Kanpur, Uttar Pradesh


Responsibilities include:

  • Understanding the different kinds of morphological variation that exist across languages.
  • Developing an unsupervised clustering algorithm to capture the morphological variants of a word.

Research Intern

IIEST Shibpur

Dec 2016 – Jan 2017 Howrah, West Bengal
Developed a program in Python and scikit-learn that identified rumor tweets in real time by learning from previous data.

Research Intern

CNeRG Lab, CSE Department, IIT Kharagpur

May 2016 – Jul 2016 Kharagpur, West Bengal


Responsibilities include:

  • Developing an online survey application in Flask and Jinja.
  • Using machine learning models to evaluate accuracy.

Undergraduate Student

IIEST Shibpur

Aug 2013 – May 2017 Howrah, West Bengal
BE in Computer Science and Technology

Recent Publications

ZSCRGAN: A GAN-based Expectation Maximization Model for Zero-Shot Retrieval of Images from Textual Descriptions

Most existing algorithms for cross-modal Information Retrieval are based on a supervised train-test setup, where a model learns to …

Distributed Representation of Tags for Active Zero Shot Learning [short paper]

Extreme multi-labeled classification (XMLC) refers to the problem of tagging items to its most relevant subset of class labels from an …

An Unsupervised Normalization Algorithm for Noisy Text: A Case Study for Information Retrieval and Stance Detection

A large fraction of textual data available today contains various types of ‘noise’, such as OCR noise in digitized …

Retrieving Information from Multiple Sources [poster]

The Web has several information sources on which an ongoing event is discussed. To get a complete picture of the event, it is important …

Combining Local and Global Word Embeddings for Microblog Stemming [short paper]

Stemming is a vital step employed to improve retrieval performance through efficient unification of morphological variants of a word. …

Recent Posts


Summary: While existing methods for few-shot image segmentation focus on 1-way segmentation, this paper focuses on k-way segmentation tasks. Existing few-shot learning algorithms suffer from distribution divergence: most existing methods must be pre-trained on ImageNet.

From CPP to Java

Some differences between C++ and Java: Java compiled code is platform independent, whereas C++ compiled code is platform dependent. The Java interpreter reports the run-time error that caused execution to halt, unlike C/C++ programs, which may simply crash.

Stats 101

Sampling Theory and Distributions

Sampling Theory: Data scientists must draw conclusions about a group (a.k.a. the population) from a few samples of it, because collecting data on the entire population is intractable. This process of drawing samples is called sampling.

How to Install Packages Locally in Linux

Some Prerequisites: What is a variable? A variable is a storage location for a value. Linux has environment variables. They can store strings, numbers, etc., just like the variables in C, C++, Python, or any other programming language.


  • 7278389228
  • Dept. of Computer Science and Engineering, IIT Kharagpur, Kharagpur, West Bengal 721302