Fake News Detection on Reddit Utilising CountVectorizer and Term Frequency-Inverse Document Frequency with Logistic Regression, MultinominalNB and Support Vector Machine

Ankitkumar Patel, Kevin Meehan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Citations (Scopus)

Abstract

The spread of misleading information, or fake news, has recently become a problem for society. On social media, where anyone can share opinions and beliefs and present them as fact, fake news threatens the reputation of companies and individuals. The 2016 US presidential election drew increased attention to the generation of fake news articles, leading a large number of researchers and scientists to explore this Natural Language Processing research area with a sense of urgency and keen interest. However, investigation into what people consume on social media is at an early stage, and efforts are in progress to explore how people can separate disinformation from truthful content. The primary challenge is determining how to detect fake news. Supervised learning methods help to detect these stories by using labelled data to determine whether a text is real or fake. This research aims to develop and compare supervised learning models using Logistic Regression, MultinominalNB and Support Vector Machine with CountVectorizer and Term Frequency-Inverse Document Frequency methods on Reddit data. The research concludes that the model combining CountVectorizer and MultinominalNB achieved the highest accuracy on the Reddit dataset.
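The kind of pipeline the abstract describes can be illustrated with a minimal, from-scratch sketch in plain Python. It is not the authors' implementation: it only mimics what scikit-learn's `CountVectorizer`, TF-IDF weighting and `MultinomialNB` (the library's spelling of the classifier) do in principle, and it omits the L2 normalisation that scikit-learn's TF-IDF applies by default. The headlines, labels and all function and class names below are hypothetical, invented purely for illustration; they are not taken from the Reddit dataset used in the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep word tokens of two or more characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


def count_vectors(docs):
    """Bag-of-words: one Counter of raw term counts per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def fit_idf(train_vectors):
    """Smoothed inverse document frequency, learned from training documents."""
    n_docs = len(train_vectors)
    doc_freq = Counter(term for vec in train_vectors for term in vec)
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def apply_idf(vectors, idf):
    """Reweight counts by IDF; terms unseen in training are dropped."""
    return [{t: c * idf[t] for t, c in vec.items() if t in idf} for vec in vectors]


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, vectors, labels):
        self.vocab = {t for vec in vectors for t in vec}
        self.term_weight = defaultdict(lambda: defaultdict(float))
        self.class_weight = defaultdict(float)
        for vec, label in zip(vectors, labels):
            for term, weight in vec.items():
                self.term_weight[label][term] += weight
                self.class_weight[label] += weight
        n = len(labels)
        self.log_prior = {c: math.log(k / n) for c, k in Counter(labels).items()}
        return self

    def _score(self, vec, label):
        # log P(class) + sum over terms of weight * log P(term | class)
        denom = self.class_weight[label] + self.alpha * len(self.vocab)
        score = self.log_prior[label]
        for term, weight in vec.items():
            if term in self.vocab:  # ignore terms never seen in training
                num = self.term_weight[label].get(term, 0.0) + self.alpha
                score += weight * math.log(num / denom)
        return score

    def predict(self, vectors):
        return [max(self.log_prior, key=lambda c: self._score(vec, c)) for vec in vectors]


# Hypothetical toy data, invented for illustration only.
train_texts = [
    "scientists publish peer reviewed climate study",
    "city council approves new transport budget",
    "central bank holds interest rates steady",
    "miracle pill cures all diseases overnight",
    "aliens secretly control world governments",
    "shocking secret doctors do not want you to know",
]
train_labels = ["real", "real", "real", "fake", "fake", "fake"]
test_texts = ["council publishes budget study", "miracle secret cures"]

train_counts = count_vectors(train_texts)
test_counts = count_vectors(test_texts)
idf = fit_idf(train_counts)

# Same classifier, two feature representations: raw counts vs. TF-IDF weights.
count_model = MultinomialNaiveBayes().fit(train_counts, train_labels)
tfidf_model = MultinomialNaiveBayes().fit(apply_idf(train_counts, idf), train_labels)

count_predictions = count_model.predict(test_counts)
tfidf_predictions = tfidf_model.predict(apply_idf(test_counts, idf))
print("counts:", count_predictions)
print("tf-idf:", tfidf_predictions)
```

On a real corpus, the same comparison would normally be run with a held-out test split and with Logistic Regression and a Support Vector Machine swapped in for the classifier. The toy data here is far too small to say anything about which combination performs best.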

Original language: English
Title of host publication: 2021 32nd Irish Signals and Systems Conference, ISSC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665434294
Publication status: Published - 10 Jun 2021
Event: 32nd Irish Signals and Systems Conference, ISSC 2021 - Athlone, Ireland
Duration: 10 Jun 2021 - 11 Jun 2021

Publication series

Name: 2021 32nd Irish Signals and Systems Conference, ISSC 2021

Conference

Conference: 32nd Irish Signals and Systems Conference, ISSC 2021
Country/Territory: Ireland
City: Athlone
Period: 10/06/21 - 11/06/21

Keywords

  • CountVectorizer
  • Fake news detection
  • Logistic Regression
  • MultinominalNB
  • Supervised Learning Methods
  • Support Vector Machine
