Sentiment Analysis

Sentiment analysis, also referred to as opinion mining, is a natural language processing (NLP) technique that identifies the emotional tone behind a body of text. Organizations commonly use it to determine and categorize opinions about a product, service, or idea. It draws on data mining, machine learning (ML), and artificial intelligence (AI) to mine text for sentiment and subjective information.

Sentiment analysis systems help organizations gather insights from unstructured text that comes from online sources such as emails, blog posts, support tickets, web chats, social media channels, forums, and comments. Algorithms replace manual data processing by implementing rule-based, automatic, or hybrid methods. Rule-based systems classify sentiment using predefined, lexicon-based rules, while automatic systems learn from data with machine learning techniques. Hybrid systems combine both approaches.
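To make the rule-based approach concrete, here is a minimal sketch of a lexicon-based scorer. The tiny lexicon, the negator list, and the function name `rule_based_sentiment` are illustrative assumptions and not part of this project; a real system would use a much larger lexicon and more careful negation handling.

```python
import re

# Tiny illustrative lexicon (an assumption; real lexicons hold thousands of entries).
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}
NEGATORS = {"not", "no", "never"}


def rule_based_sentiment(text: str) -> str:
    """Label text 'positive', 'negative', or 'neutral' from summed word scores."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True  # flip the polarity of the next word only
            continue
        value = LEXICON.get(token, 0)
        score += -value if negate else value
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

An automatic system would replace the fixed lexicon with weights learned from labeled data, and a hybrid system could use a score like this one as an extra input feature to the learned model.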


• To track the growth of panic among Twitter® users based on a specific keyword.
• To analyze the sentiment of people in India after the lockdown imposed by the government.
• To identify the primary topics that netizens tweeted about in relation to the COVID-19 pandemic.
• To study how Chinese Weibo users were affected emotionally on and after 20 January 2020.



Task description

This competition is for sentiment analysis of tweets related to the COVID-19 pandemic, which is a multi-class text classification task: each tweet receives exactly one of three labels. Since its outbreak, the coronavirus has affected more than 180 countries, causing massive economic and job losses globally and confining about 58% of the global population. Studying how people feel is essential for supporting mental health and keeping the public informed about COVID-19.

In this competition, the released training data contains 1.6 million labeled tweets, while the released validation data contains 10,000 unlabeled tweets. The training data has 6 columns: Target, Tweet ID, Date, Flag, User, and Text. The Target values are encoded as Negative (0), Neutral (2), and Positive (4). For example, a label of 0 means that the tweet is labeled Negative. The public ranking of the competition will be based on predictions on the validation dataset. The final ranking will be based on a private testing dataset that has the same distribution as the training and validation datasets.
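As an illustration of the data layout described above, the following sketch reads rows with the six listed columns and decodes the Target value into a label name. It assumes a header-less CSV with the columns in the order given; the sample rows, the lowercase column identifiers, and the function name `read_labeled_tweets` are made up for the example.

```python
import csv
import io

# Column order as listed in the task description (identifiers are assumptions).
COLUMNS = ["target", "tweet_id", "date", "flag", "user", "text"]
LABEL_NAMES = {0: "Negative", 2: "Neutral", 4: "Positive"}


def read_labeled_tweets(handle):
    """Yield (label_name, text) pairs from a header-less CSV of training rows."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS)
    for row in reader:
        yield LABEL_NAMES[int(row["target"])], row["text"]


# Made-up rows, only to show the expected shape of the data.
sample = io.StringIO(
    '0,1001,Mon Apr 06 2020,NO_QUERY,user_a,"Lockdown again, feeling anxious"\n'
    '4,1002,Mon Apr 06 2020,NO_QUERY,user_b,"Grateful for the healthcare workers"\n'
)
labeled = list(read_labeled_tweets(sample))
```

For the real training file, the same function could be called on a handle returned by `open(path, newline="", encoding=...)`, with the encoding chosen to match the released data.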

• Python code -> Twitter API -> Log in to the Twitter account using the generated key.
• Topic selection -> Time selection -> Extraction of related tweets into a CSV file (testing dataset).
• Data cleansing of the testing dataset -> Cleaned testing dataset.
• Training the neural network model on the training dataset.
• Generating sentiment labels for the extracted tweets using the model.
• Checking the classification accuracy.
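Two of the steps above, data cleansing and the accuracy check, can be sketched as follows. The specific cleaning rules (dropping URLs, mentions, and punctuation) are common choices assumed here for illustration, not the project's documented pipeline, and both function names are hypothetical.

```python
import re


def clean_tweet(text: str) -> str:
    """Normalize a raw tweet into lowercase words separated by single spaces."""
    text = re.sub(r"https?://\S+", " ", text)      # drop URLs
    text = re.sub(r"@\w+", " ", text)              # drop user mentions
    text = text.replace("#", " ")                  # keep hashtag words, drop the symbol
    text = re.sub(r"[^A-Za-z0-9\s']", " ", text)   # drop remaining punctuation and emoji
    return re.sub(r"\s+", " ", text).strip().lower()


def accuracy(predicted, actual) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

For example, `clean_tweet("@WHO Stay home! #COVID19 https://t.co/abc")` yields `"stay home covid19"`, and `accuracy([0, 2, 4, 4], [0, 2, 4, 0])` yields `0.75`.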










Future Scope

We can further improve the project by making it real-time: when someone posts a tweet, it would pass through our model and automatically receive a sentiment label, shown immediately in the Twitter app itself.



Team Outliers
E-mail - outliers2k22@gmail.com