Volume 34, Issue 5 p. 826-844
Special Issue Article
Open Data, Open Materials

Psychometric and Validity Issues in Machine Learning Approaches to Personality Assessment: A Focus on Social Media Text Mining

Louis Tay

Corresponding Author

Department of Psychological Sciences, Purdue University, West Lafayette, IN, USA

Correspondence to: Louis Tay, Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN 47907, USA.

E-mail: stay@purdue.edu

Sang Eun Woo

Department of Psychological Sciences, Purdue University, West Lafayette, IN, USA

Louis Hickman

Department of Psychological Sciences, Purdue University, West Lafayette, IN, USA

Rachel M. Saef

Northern Illinois University, DeKalb, IL, USA

First published: 16 July 2020
Citations: 9

Louis Tay and Sang Eun Woo contributed equally to the paper.

Abstract

In the age of big data, substantial research is now moving toward using digital footprints like social media text data to assess personality. Nevertheless, there are concerns and questions regarding the psychometric and validity evidence of such approaches. We seek to address this issue by focusing on social media text data and (i) conducting a review of psychometric validation efforts in social media text mining (SMTM) for personality assessment and discussing additional work that needs to be done; (ii) considering additional validity issues from the standpoint of reference (i.e. ‘ground truth’) and causality (i.e. how personality determines variations in scores derived from SMTM); and (iii) discussing the unique issues of generalizability when validating SMTM for personality assessment across different social media platforms and populations. In doing so, we explicate the key validity and validation issues that need to be considered as a field to advance SMTM for personality assessment, and, more generally, machine learning personality assessment methods.

© 2020 European Association of Personality Psychology

Open Research Badges

This article earned Open Data and Open Materials badges through Open Practices Disclosure from the Center for Open Science: https://osf.io/tvyxz/wiki. The data and materials are permanently and openly accessible at https://osf.io/cgpmz/?view_only=4a56e3fb9aa6476bb2b9b27273b4124d. The authors' disclosure form may also be found in the Supporting Information in the online version.
