Human Papillomavirus Vaccine Discourse and Sentiment on Reddit Before and After COVID-19: Mixed Methods Retrospective Cross-Sectional Study

Background: Human papillomavirus (HPV) is a sexually transmitted virus that causes various oropharyngeal and anogenital cancers. The HPV vaccine provides protection against several strains of HPV and is a key preventative tool against HPV-related cancers; however, vaccination rates remain suboptimal in the United States due to variable state mandates and misperceptions of vaccine efficacy and risks. As social media becomes an increasingly popular avenue for health discussions, platforms such as Reddit offer opportunities to understand public vaccine discourse, particularly among underrepresented groups. Furthermore, vaccine hesitancy and mistrust generally increased during the COVID-19 pandemic, potentially impacting HPV vaccination rates.

Objective: This study aimed to characterize HPV vaccine–related discussions on Reddit by (1) identifying dominant themes, (2) assessing sentiment, and (3) examining shifts in themes and sentiment before and after the start of the COVID-19 pandemic.

Methods: This convergent mixed methods analysis used a Python-based script to extract Reddit posts and comments referencing the HPV vaccine from multiple subreddits from September 13, 2016, to February 13, 2025. Entries were manually labeled as intentional or incidental, with intentional entries further categorized into 10 thematic categories. Sentiment analysis was conducted using the VADER (Valence Aware Dictionary and sEntiment Reasoner) algorithm. Pre– vs post–COVID-19 comparisons used March 11, 2020, as the cutoff date. Temporal trends were assessed using pre– and post–COVID-19 stratification. Chi-square tests, Mann-Whitney tests, and linear regression were used for statistical analysis.

Results: Of 4235 collected entries, 2801 (66.1%) intentional posts and comments were analyzed. The most common themes were factual claims (703/2801, 25.1%), personal experiences (571/2801, 20.4%), and offering advice (465/2801, 16.6%). Overall sentiment was 51.7% (1448/2801) positive (95% CI 49.8%-53.5%), 37.6% (1053/2801) negative, and 10.8% (300/2801) neutral, with a median sentiment score of 0.13 (IQR –0.49 to 0.71). Of the pre–COVID-19 entries (n=278), 124 (44.6%) were positive (95% CI 38.7%-50.7%), with a median sentiment score of 0.00 (IQR –0.66 to 0.58). Of the post–COVID-19 entries (n=2523), 1324 (52.4%) were positive (95% CI 50.5%-54.4%), with a median sentiment score of 0.17 (IQR –0.46 to 0.72). Sentiment increased significantly post–COVID-19 (P=.01). Theme distribution differed before vs after COVID-19 (χ²=114.47, P<.001). Pre–COVID-19 discussions overrepresented barriers and social or cultural influences, whereas post–COVID-19 discussions more frequently reflected personal experiences, advice seeking, and advice offering. Posting volume increased by 50% per year throughout the study period (incidence rate ratio=1.50, 95% CI 1.47-1.53; P<.001), with a steeper increase post–COVID-19 (incidence rate ratio=1.60, 95% CI 1.56-1.65).

Conclusions: This study provides a post–COVID-19 perspective on patient-generated HPV vaccine discourse by integrating thematic content with sentiment and temporal trends. Prior studies have been limited by a narrower thematic scope and have focused on prepandemic periods. By examining discussions on an anonymous platform such as Reddit, this analysis captures personal conversations and assesses pandemic-related shifts in discourse. These findings add to the literature on digital vaccine communication and highlight opportunities for targeted public health engagement and strategies to leverage online communities.
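The pre– vs post–COVID-19 stratification and the positive/negative/neutral labeling described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' script. It assumes that each entry already carries a VADER compound score, that the cutoff date itself counts as post–COVID-19, and that the conventional compound thresholds of ±0.05 apply (the abstract does not state which thresholds were used). The function names and the example entries are hypothetical.

```python
from datetime import date

# Cutoff date stated in the Methods. Treating the cutoff day itself as
# "post" is an assumption.
COVID_CUTOFF = date(2020, 3, 11)

# Conventional VADER thresholds on the compound score. The abstract does
# not state the thresholds the authors used, so these are an assumption.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def period(posted: date) -> str:
    """Assign an entry to the pre- or post-COVID-19 stratum."""
    return "post" if posted >= COVID_CUTOFF else "pre"


def sentiment_label(compound: float) -> str:
    """Map a compound score in [-1, 1] to a sentiment category."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def percent_positive(entries):
    """entries: iterable of (posted_date, compound_score) pairs.

    Returns the percentage of positive entries in each non-empty stratum.
    """
    totals = {"pre": 0, "post": 0}
    positives = {"pre": 0, "post": 0}
    for posted, compound in entries:
        stratum = period(posted)
        totals[stratum] += 1
        if sentiment_label(compound) == "positive":
            positives[stratum] += 1
    return {s: 100 * positives[s] / totals[s] for s in totals if totals[s]}


# Made-up illustrative entries, not study data.
example = [
    (date(2019, 6, 1), -0.40),
    (date(2019, 11, 20), 0.00),
    (date(2021, 2, 3), 0.62),
    (date(2023, 8, 15), 0.01),
]

print(percent_positive(example))
```

Keeping the cutoff and thresholds as named constants makes it easy to check how sensitive the reported proportions are to either choice.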

Introduction
Human papillomavirus (HPV) is a sexually transmitted virus and a well-established risk factor for various cancers in both men and women. Most commonly associated with cervical cancer, HPV has also been increasingly linked to anogenit