Binary Classification: An Introductory Machine Learning Tutorial for Social Scientists

Journal of Methods and Measurement in the Social Sciences. Pub Date: 2021-12-12. DOI: 10.2458/jmmss.5186

Vivian P. Ta, Leonardo Carrico, Arthur Bousquet

Citations: 0

Abstract

A barrier that prevents many social scientists from pursuing big data research is the lack of technical training required to assemble and organize big data. To address this barrier, we provide an introductory tutorial on machine learning for social scientists by demonstrating the basic steps and fundamental concepts involved in binary classification. We first describe the data and libraries required for analysis. We then demonstrate data cleaning methods, feature engineering, the model-building process, model assessment, and feature importance. Finally, we discuss how social scientists can use machine learning to complement inference-based approaches and how it can contribute to a richer understanding of social science.
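The workflow named in the abstract (cleaning, feature engineering, model building, assessment, feature importance) can be illustrated with a minimal sketch. The abstract does not specify the tutorial's data, libraries, or model, so everything here is an assumption made for illustration: the toy data are synthetic, the feature names `hours_online` and `posts_per_week` are hypothetical, and logistic regression trained by gradient descent with only the Python standard library stands in for whatever classifier the authors use.

```python
import math
import random

def make_toy_data(n=200, seed=0):
    """Synthetic, hypothetical records: (hours_online, posts_per_week, label)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hours = rng.uniform(0, 10)
        posts = rng.uniform(0, 20)
        # Assumed rule: the label depends mostly on hours, plus noise.
        label = 1 if hours + 0.1 * posts + rng.gauss(0, 1) > 6 else 0
        if rng.random() < 0.05:
            hours = None  # simulate a missing value
        rows.append((hours, posts, label))
    return rows

def clean(rows):
    """Data cleaning: drop any row with a missing field."""
    return [r for r in rows if None not in r]

def standardize(train_X, X):
    """Feature engineering: z-score X using the training means and stds."""
    cols = list(zip(*train_X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Model building: batch gradient descent on the logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

def accuracy(y_true, y_pred):
    """Model assessment: share of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def run_pipeline(seed=0):
    rows = clean(make_toy_data(seed=seed))
    random.Random(seed).shuffle(rows)
    split = int(0.8 * len(rows))  # 80/20 train/test split
    train, test = rows[:split], rows[split:]
    X_train_raw = [list(r[:2]) for r in train]
    X_test_raw = [list(r[:2]) for r in test]
    y_train = [r[2] for r in train]
    y_test = [r[2] for r in test]
    X_train = standardize(X_train_raw, X_train_raw)
    X_test = standardize(X_train_raw, X_test_raw)
    w, b = fit_logistic(X_train, y_train)
    acc = accuracy(y_test, predict(X_test, w, b))
    # Feature importance: absolute weight on each standardized feature.
    importance = dict(zip(["hours_online", "posts_per_week"],
                          (abs(v) for v in w)))
    return acc, importance

if __name__ == "__main__":
    acc, importance = run_pipeline()
    print("held-out accuracy:", acc)
    print("feature importance:", importance)
```

Because the features are standardized before fitting, the absolute weights are comparable across features, which is one simple way to read feature importance from a linear classifier. In practice a library such as scikit-learn would likely replace the hand-written training loop.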

Source journal: Journal of Methods and Measurement in the Social Sciences

Self-citation rate: 0.00%

Articles published:

Review time: 26 weeks