Special Issue Article

Targeting Item-level Nuances Leads to Small but Robust Improvements in Personality Prediction from Digital Footprints

Corresponding Author

Andrew N. Hall

Correspondence to: Andrew N. Hall, Department of Psychology, Northwestern University, Swift Hall 102, 2029 Sheridan Road, Evanston, IL 60208, USA. E-mail: ahall4488@gmail.com; andrewhall@u.northwestern.edu

Sandra C. Matz

Columbia Business School, Columbia University, New York City, NY, USA

Department of Psychology, Northwestern University, Evanston, IL, USA

First published: 15 April 2020

https://doi.org/10.1002/per.2253

Citations: 9

Abstract

In the past decade, researchers have demonstrated that personality can be accurately predicted from digital footprint data, including Facebook likes, tweets, blog posts, pictures, and transaction records. Such computer-based predictions from digital footprints can complement, and in some circumstances even replace, traditional self-report measures, which suffer from well-known response biases and are difficult to scale. However, these previous studies have focused on the prediction of aggregate trait scores (e.g. a person's extroversion score), which may obscure prediction-relevant information at theoretical levels of the personality hierarchy beneath the Big 5 traits. Specifically, recent research has demonstrated that personality may be better represented by so-called personality nuances (item-level representations of personality) and that utilizing these nuances can improve predictive performance. The present work examines the hypothesis that personality predictions from digital footprint data can be improved by first predicting personality nuances and subsequently aggregating them to trait scores, rather than predicting trait scores outright. To examine this hypothesis, we employed least absolute shrinkage and selection operator (LASSO) regression and random forest models to predict both items and traits using out-of-sample cross-validation. In nine out of 10 cases across the two modelling approaches, nuance-based models improved the prediction of personality over the trait-based approaches to a small but meaningful degree (4.25% or 1.69% on average, depending on method). Implications for personality prediction and personality nuances are discussed. © 2020 European Association of Personality Psychology
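The contrast between the two pipelines described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the simulated data, the function names (`fit_lasso`, `cv_predict`), the fixed penalty `alpha`, the interleaved fold assignment, and the relative-change formula in the last line are all assumptions made for illustration, and the lasso is a plain coordinate-descent version written with the Python standard library only.

```python
import math
import random
import statistics


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, alpha=0.05, n_iter=30):
    """Cyclic coordinate descent for
    (1 / 2n) * sum((y - b0 - X.w)^2) + alpha * sum(|w|)."""
    n, p = len(X), len(X[0])
    intercept = statistics.fmean(y)
    weights = [0.0] * p
    residual = [yi - intercept for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        shift = statistics.fmean(residual)  # unpenalized intercept update
        intercept += shift
        residual = [r - shift for r in residual]
        for j in range(p):
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            new_w = soft_threshold(rho + col_sq[j] * weights[j], alpha) / col_sq[j]
            delta = new_w - weights[j]
            if delta:
                residual = [residual[i] - delta * X[i][j] for i in range(n)]
                weights[j] = new_w
    return intercept, weights


def cv_predict(X, y, n_folds=5, alpha=0.05):
    """Out-of-sample predictions: each person is predicted by a model
    fitted on the other folds (interleaved fold assignment)."""
    preds = [0.0] * len(X)
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        b0, w = fit_lasso([X[i] for i in train], [y[i] for i in train], alpha)
        for i in range(fold, len(X), n_folds):
            preds[i] = b0 + sum(wj * xj for wj, xj in zip(w, X[i]))
    return preds


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


# Hypothetical data: 150 people, 12 footprint features, 5 questionnaire items.
# Each item shares a common signal and also loads on one feature of its own.
rng = random.Random(0)
N_ITEMS = 5
X = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(150)]
items = [[0.5 * row[0] + 0.5 * row[1] + 0.8 * row[2 + k] + rng.gauss(0, 1)
          for k in range(N_ITEMS)] for row in X]
trait = [statistics.fmean(person) for person in items]

# Trait-based approach: predict the aggregate score directly.
trait_pred = cv_predict(X, trait)

# Nuance-based approach: predict every item, then average the predictions.
item_preds = [cv_predict(X, [person[k] for person in items]) for k in range(N_ITEMS)]
nuance_pred = [statistics.fmean(person) for person in zip(*item_preds)]

r_trait = pearson(trait_pred, trait)
r_nuance = pearson(nuance_pred, trait)
print(f"trait-based r = {r_trait:.3f}, nuance-based r = {r_nuance:.3f}")
print(f"relative change = {100 * (r_nuance - r_trait) / r_trait:.2f}%")
```

One design point worth noting: with an unpenalized linear model fitted on the same features, averaging the item-level predictions is algebraically identical to predicting the item mean directly, so any gap between the two approaches can only arise through regularization or nonlinearity, as in the LASSO and random forest models named above.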

Open Research

Open Research Badges

This article earned an Open Materials badge through Open Practices Disclosure from the Center for Open Science: https://osf.io/tvyxz/wiki. The materials are permanently and openly accessible at https://osf.io/3rdju/. The authors' disclosure form may also be found in the Supporting Information in the online version.

Supporting Information

Filename

Description

per2253-sup-0001-Supplementary Material.docx (Word 2007 document, 996.5 KB)

Figure S1. Difference in magnitude of Spearman correlations between predicted personality traits (nuance-model and trait-model) from Random Forest models and 11 outcomes. Higher values (blue in the above plot) indicate stronger correlation between item-level personality traits and the outcome than between aggregate-level personality traits and the outcome.

Figure S2. Spearman correlations of predicted outcome value scores between self-reported personality models and nuance vs. trait models. Predicted nuance and trait values come from Random Forest models. All predicted values are the result of 5-fold cross-validation using standard multiple regression to predict the outcome variable. Blue points indicate that nuance-model predictions correlate more strongly with self-reported predictions, while red points indicate that trait-model predictions correlate more strongly with self-reported predictions. A line with slope m = 1 is included for reference: points on this line would indicate equal prediction, points above it indicate that nuance models outperform, and points below it indicate that trait models outperform.

Table S1. Raw Spearman correlations between predicted personality traits and observed external outcome values for both LASSO and Random Forest results. The self-report column denotes the correlation of observed self-reported personality traits with outcomes.

Table S2. RMSE values calculated between the predicted trait scores and actual trait scores for the LASSO (left) and Random Forest (right) models. The final column displays the absolute change between the two model types.

per2253-sup-0002-Open_Practices_Disclosure_Form.pdf (PDF document, 1 MB)

Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.
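The supporting tables and figures above report accuracy as Spearman correlations and RMSE values, with an absolute change between model types. As a rough illustration of how such quantities are computed, here is a small standard-library sketch. The toy scores and the names `trait_model` and `nuance_model` are hypothetical and are not taken from the supplementary material.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))


def rmse(predicted, observed):
    """Root mean squared error between predicted and observed scores."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


# Hypothetical scores for five people (illustration only).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
trait_model = [1.5, 2.5, 2.0, 4.5, 4.0]
nuance_model = [1.2, 1.9, 3.1, 3.8, 5.3]

rho_trait = spearman(trait_model, observed)
rho_nuance = spearman(nuance_model, observed)
rmse_trait = rmse(trait_model, observed)
rmse_nuance = rmse(nuance_model, observed)

print(f"Spearman: trait = {rho_trait:.3f}, nuance = {rho_nuance:.3f}")
print(f"RMSE: trait = {rmse_trait:.3f}, nuance = {rmse_nuance:.3f}")
print(f"absolute RMSE change = {rmse_trait - rmse_nuance:.3f}")
```

Spearman correlation depends only on rank order, so it is less sensitive to outliers and skewed distributions than Pearson correlation, whereas RMSE is expressed in the units of the trait scale itself.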

Volume 34, Issue 5

Special Issue: Behavioral personality science in the age of big data

September/October 2020

Pages 873-884
