Are we there yet? Thematic analysis, NLP, and machine learning for research

Fitkov-Norris, Elena and Kocheva, Nataliya (2023) Are we there yet? Thematic analysis, NLP, and machine learning for research. In: 22nd European Conference on Research Methodology for Business and Management Studies (ECRM 2023); 6 Sep 2023, Lisboa, Portugal.

Official URL: https://doi.org/10.34190/ecrm.22.1.1616

Abstract

Thematic analysis is a well-established technique for qualitative analysis and is covered in traditional research methods training. Its objective is to elicit themes and significant topics from discursive data such as free-style discussions, semi-structured or unstructured interviews, and comments. The approach is laborious and time consuming, and it requires significant input from researchers to identify and code the themes, although software tools such as NVivo, T-Lab and IRaMuTeQ can aid with the presentation of results. Recent developments in Machine Learning (ML) and Natural Language Processing (NLP) have boosted interest in text analytics and its applications to social science research. For example, automatic topic identification using ML NLP offers valuable insights in social media analytics. However, machine learning techniques conventionally rely on large data sets to enable the algorithm to elicit themes, and more recent research efforts have turned to the performance of machine learning approaches with smaller data sets. This study aims to compare and contrast the effectiveness of themes generated by ML NLP with those generated by human researchers, using the text analytics tools NVivo, T-Lab and IRaMuTeQ, as well as the low-code ML tool KNIME, for automatically eliciting themes from academic literature review in the contexts of service operations management research and semi-structured customer interviews. Results indicate that the ML NLP approach has the potential to detect research themes automatically even with small data sets, although the results vary across the different tools and depend on the capabilities of the built-in text analytic algorithms. In particular, T-Lab offered the best mapping of machine-learning-derived topics to researcher themes, and KNIME proved the most robust software, able to derive meaningful topics even with very small sample sizes.
The findings also have significant implications for the training of research students: they suggest that including ML NLP tools and algorithms in the training curriculum of social scientists may be beneficial.

Item Type:	Conference or Workshop Item (Paper)
Event Title:	22nd European Conference on Research Methodology for Business and Management Studies (ECRM 2023)
Organising Body:	ECRM
Additional Information:	Published in: (2023) Prof Florinda Matos and Prof Álvaro Rosa (editors), Proceedings of the 22nd European Conference on Research Methodology for Business and Management Studies, ECRM2023, Academic Conferences International Limited, Reading, ISBN 9781914587719, ISSN 2049-0968, pp. 93-102
Research Area:	Research Areas > Business and management studies; Research Areas > Computer science and informatics; Domains Case Studies > Research Methods
Faculty, School or Research Centre:	Kingston Business School; Kingston Business School > Department of Accountancy, Finance and Informatics
Date Deposited:	24 Jul 2023 09:25
Last Modified:	12 Sep 2023 15:44
DOI:	https://doi.org/10.34190/ecrm.22.1.1616
URI:	https://eprints.kingston.ac.uk/id/eprint/54206
