The study aimed to extract online reviews of otolaryngologists in the 20 most populated cities in the United States from healthgrades.com, to develop and validate a natural language processing (NLP) logistic regression algorithm for automated classification of review text into 10 categories, and to compare 1- and 5-star reviews across directly physician-related and non-physician-related categories.
A total of 1,977 1-star and 12,682 5-star reviews were collected. The primary investigator manually categorized a training dataset of 324 1-star and 909 5-star reviews, while a validation subset of 50 1-star and 100 5-star reviews underwent dual manual categorization. Using scikit-learn, an NLP algorithm was trained and validated on these subsets, with F1 scores used to evaluate classification accuracy against the manual categorization. The algorithm was then applied to the entire dataset, and category frequencies were compared between 1- and 5-star reviews.
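The abstract describes the pipeline only at a high level. A minimal sketch of one plausible scikit-learn setup follows, assuming TF-IDF features and one-vs-rest logistic regression; the category names, review texts, and labels below are invented placeholders, not study data. F1, the harmonic mean of precision and recall, is computed per category, mirroring the validation step.

```python
# Hypothetical sketch of a multi-label review classifier, assuming TF-IDF
# features and one-vs-rest logistic regression; the study's exact features,
# hyperparameters, and labels are not specified in the abstract.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline

CATEGORIES = [  # hypothetical stand-ins for the study's 10 categories
    "treatment_plan", "accessibility", "wait_time", "office_scheduling",
    "billing", "facilities", "surgery_procedure", "bedside_manner",
    "staff_midlevels", "other",
]

# Placeholder examples; the real training set was 1,233 manually
# categorized reviews (324 1-star, 909 5-star).
train_texts = [
    "The doctor explained my treatment plan clearly.",
    "I waited two hours past my appointment time.",
]
# Binary indicator matrix: one column per category, 1 if mentioned.
y_train = np.array([
    [1, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
])

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("lr", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])
clf.fit(train_texts, y_train)

# Validation: predicted labels are scored against the dual manual
# categorization with a per-category F1 score.
val_texts = ["Billing was a nightmare but the surgery went well."]
y_val = np.array([[0, 0, 0, 0, 1, 0, 1, 0, 0, 0]])
y_pred = clf.predict(val_texts)
per_category_f1 = f1_score(y_val, y_pred, average=None, zero_division=0)
for name, score in zip(CATEGORIES, per_category_f1):
    print(f"{name}: F1 = {score:.2f}")
```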
F1 scores for NLP validation ranged from 0.71 to 0.97. Significant associations emerged between 1-star reviews and treatment plan, accessibility, wait time, office scheduling, billing, and facilities; 5-star reviews were associated with surgery/procedure, bedside manner, and staff/mid-levels.
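The abstract does not name the statistical test behind these associations. One conventional choice would be a chi-square test of independence on a 2x2 contingency table (star rating by category mention) for each category, as sketched below; the counts shown are invented placeholders, not study results.

```python
# Hypothetical per-category association test; the study's actual test and
# counts are not reported in the abstract.
from scipy.stats import chi2_contingency

# Rows: 1-star, 5-star; columns: mentions category, does not mention it.
table = [[312, 1665],    # placeholder counts among 1,977 1-star reviews
         [507, 12175]]   # placeholder counts among 12,682 5-star reviews
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p:.3g}")
```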
The study validated an NLP text classification system for categorizing online physician reviews. Positive reviews were associated with directly physician-related categories, whereas 1-star reviews were associated with treatment plan, accessibility, wait time, office scheduling, billing, and facilities. This method of text classification effectively discerned the nuances of human-written text, providing a scalable means of gaining insight into online healthcare feedback.
Level of evidence: 3