Date of Project

4-5-2025

Document Type

Honors Thesis

School Name

College of Arts and Sciences

Department

Computer Science

Major Advisor

Dr. Sayani Sarkar

Second Advisor

Dr. Andrew Karem

Abstract

Adversarial attacks pose a significant threat to the reliability of machine-learning-based spam detection systems on social media. This undergraduate thesis, "From Adversarial Attacks to Robust Classifiers: A Study in Social Media Spam Detection – Black Box & White Box," systematically examines the impact of both black-box and white-box adversarial attacks on a range of spam classifiers, including Logistic Regression, Decision Trees, Random Forests, K-Nearest Neighbors, Bagging, Gradient Boosting, and Support Vector Machines. Using a novel dataset derived from Twitter spam messages and augmented with adversarial perturbations such as synonym replacement and character-level modifications, the study evaluates classifier performance under realistic attack scenarios. Exploratory data analysis reveals how adversarial manipulation alters linguistic features and classification outcomes. Experimental results demonstrate that adversarial attacks can significantly degrade model accuracy, with white-box attacks generally proving more effective than black-box attacks. The thesis further discusses defense strategies and model robustness, offering practical insights for developing spam detection systems that are more resilient to evolving adversarial threats. This work contributes to natural language processing, cybersecurity, and social media analytics by highlighting the urgent need for classifiers that can withstand adversarial manipulation.

Available for download on Thursday, April 30, 2026
