Received: November 30, 2023
Accepted: March 08, 2024
Available: April 24, 2024
Computational Thinking (CT) is considered a key literacy skill in the digital age. It encompasses problem-solving, mathematical thinking, critical thinking, creativity, and communication. Since research on CT evaluation is in a consolidation phase, there is still a lack of systematic grouping of assessment instruments across different educational levels. This review aimed to identify the instruments used to measure CT, the evaluated skills, and the psychometric properties of these instruments. For such purpose, a systematic review of 52 articles published between 2012 and 2022 was conducted. The results revealed a significant growth in publications on the design and validation of CT measurement instruments in recent years. Over 80 % of the instruments demonstrated validity and reliability, particularly in terms of content validity, construct validity, and internal consistency. Furthermore, some instruments also evaluated affective and social skills, as well as attitudes, which enhanced the assessment of cognitive skills. However, the absence of contributions from Central and South American countries in the analyzed literature, along with the scarcity of instruments aimed at early childhood and teachers, highlights the need for further research into CT assessment in specific populations.
Keywords: Computational thinking, assessment instruments, psychometric properties, thinking skills, statistical methods.
In recent decades, the scientific and educational communities have stressed the importance of incorporating Computational Thinking (CT) into curricula across all levels of education. Nonetheless, given its status as an emerging field, there is still no consensus on its definition and practical application. Consequently, a variety of approaches have been adopted to integrate it in students’ learning process. This lack of a standardized definition for CT makes it challenging to design methods and tools for its evaluation [
From a conceptual standpoint, various definitions have been put forth for CT. For instance, [
Several initiatives have been developed to integrate CT into curricula, as well as tools for its accurate and reliable assessment. These tools include questionnaires [
This systematic review was motivated by the need to measure CT and identify the instruments that have been designed for its assessment. Specifically, the goal is to analyze a set of bibliometric indicators and variables of interest, such as the type of instrument, number of items, target population, sample size, evidence of pilot testing, identification of skills/competencies, theoretical foundations, and psychometric properties.
1.1 Literature review
An examination of previous studies on the use of CT assessment tools provided valuable insights into the current landscape of this field. The search yielded 15 reviews, classified into different categories: scoping reviews [
The retrieved review articles contributed to shedding light on research related to CT assessment. For instance, a scoping review of CT assessments in higher education [
In another scoping review of empirical research on recent CT assessments [
In their systematic review [
In the identified scoping reviews, mapping reviews, and meta-analyses, Scopus, ScienceDirect, ERIC, and Web of Science (WOS) were the most frequently consulted sources. As for the target population, Figure 1 shows that the most commonly selected population was students in K-12 educational settings. No reviews targeting teachers were found in the analysis.
Out of the 15 reviews, the one conducted by [
Additionally, Table 1 lists the titles of the review articles, along with the country where the research was conducted.
| # | Title | Country |
| 1 | A scoping review of computational thinking assessments in higher education [ | Canada |
| 2 | A Scoping Review of Empirical Research on Recent Computational Thinking Assessments [ | Canada |
| 3 | Approaches to Assess Computational Thinking Competences Based on Code Analysis in K-12 Education: A Systematic Mapping Study [ | Brazil |
| 4 | Assessing computational thinking: A systematic review of empirical studies [ | USA |
| 5 | Computational thinking and academic achievement: A meta-analysis among students [ | China |
| 6 | Computational thinking learning experiences, outcomes, and research in preschool settings: a scoping review of literature [ | USA |
| 7 | Computational Thinking Through an Empirical Lens: A Systematic Review of Literature [ | China |
| 8 | Computational thinking in primary education: a systematic literature review [ | Italy |
| 9 | How to Develop Computational Thinking: A Systematic Review of Empirical Studies [ | Türkiye |
| 10 | Mapping Computational Thinking through Programming in K-12 Education: A Conceptual Model based on a Systematic Literature Review [ | Greece |
| 11 | Preschool children, robots, and computational thinking: A systematic review [ | USA, Uruguay |
| 12 | Trends and development in research on computational thinking [ | Türkiye |
| 13 | Unleashing the Potential of Abstraction From Cloud of Computational Thinking: A Systematic Review of Literature [ | China |
| 14 | Which way of design programming activities is more effective to promote K-12 students' computational thinking skills? A meta-analysis [ | China |
| 15 | An investigation of the data collection instruments developed to measure computational thinking [ | Türkiye |
This systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [
2.1 Research questions, objectives, and variables
The following research questions were proposed for this systematic review:
All these questions serve to identify the tools that have been used for CT assessment, their psychometric properties, and the evaluated skills. The proposed variables were divided into two categories: (i) bibliometric indicators, encompassing title, source of information, publication year, country, language, authors, journal, quartile, and the SCImago Journal Rank (SJR) and Journal Citation Reports (JCR) indices; and (ii) variables of interest, including type of instrument, number of items, age of target population, evaluated skills/competencies, theoretical foundations, authors, sample size, pilot testing, and method for determining instrument validity and reliability.
2.2 Literature search
To conduct the search, the following eight search strings were formulated, incorporating key concepts such as computational thinking, measurement, and instruments, while adhering to the syntax required by the employed databases:
"Pensamiento Computacional" AND medición
"Computational Thinking" AND measuring
"Computational thinking" + "measuring instruments"
"Computational thinking" + "measurement"
"Computational thinking" + "measure instruments"
"Computational thinking" + "measurement tool"
"Computational thinking" AND ("measur* instruments" OR "measur* tool*")
"Computational thinking" AND ("assess" OR "validity" OR "reliability" OR "test" OR "scale")
The search spanned from 2012 to 2022 because, as indicated by [
Regarding exclusion and inclusion criteria, only research articles and reviews were considered for analysis, while publications in book formats, posters, conference proceedings, or articles that did not employ a specific instrument for measuring CT were excluded.
2.3 Identified articles
Initially, the search yielded 439 articles. After removing duplicates, 204 articles remained. Following further screening for relevance, 115 articles were retained. Finally, by applying exclusion criteria, a total of 52 articles were selected for the systematic review. Figure 3 provides a summary of the articles identified at each stage of the search process, which was conducted following the PRISMA statement.
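The screening funnel described above implies how many records were dropped at each stage. The following minimal Python sketch makes that arithmetic explicit. The stage labels and the names `stages` and `removed_per_stage` are illustrative assumptions; only the four counts come from the text.

```python
# Counts reported in the text for each stage of the screening process.
# Stage labels are illustrative; only the numbers come from the review.
stages = [
    ("records identified", 439),
    ("after duplicate removal", 204),
    ("after relevance screening", 115),
    ("included in the review", 52),
]


def removed_per_stage(stages):
    """Return (label, removed) pairs for each transition between stages."""
    removed = []
    for (_, before), (label, after) in zip(stages, stages[1:]):
        removed.append((label, before - after))
    return removed


dropped = removed_per_stage(stages)
for label, count in dropped:
    print(f"{label}: {count} records removed")

# Share of the initially identified records that reached the final sample.
retention = stages[-1][1] / stages[0][1]
print(f"retained: {retention:.1%}")
```

Under these counts, 235 records were removed as duplicates, 89 during relevance screening, and 63 when the exclusion criteria were applied, which together account for the difference between the 439 identified and the 52 included articles.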
3.1 Analysis of bibliometric indicators
3.1.1 Consulted databases
The majority of the articles (approximately 63.4 %) were found in Scopus. Figure 4 shows the number of articles retrieved from each consulted database, with several appearing in multiple sources.
3.1.2 Title keywords
According to the analysis, computational thinking was the most prevalent term in the titles of the examined articles, often accompanied by valid, scale, evaluate, and test, all of which allude to important features of the measurement instruments.
3.1.3 Publication year
As mentioned earlier, the search spanned from 2012 to 2022. Remarkably, none of the five sources yielded publications related to instrument construction before 2017. Figure 5 depicts the increase in the number of publications dedicated to CT measurement instruments throughout the analyzed period.
3.1.4 Country where the research was conducted
Türkiye was the country with the highest number of articles (ten in total), followed by China and the United States, with eight and seven articles, respectively. Only one study was carried out in Latin America, specifically in Venezuela. Figure 6 displays the distribution of articles by country.
3.1.5 Language
English was the most prevalent language, with 90 % of the articles being written in this language. Spanish accounted for 4 %, Turkish 4 %, and Japanese 2 %.
3.1.6 Authors
Table 2 presents the most prominent authors based on the number of published articles.
| Author | Published articles |
| Yan Li [ | 2 |
| Juan Carlos Pérez González [ | 2 |
| Jungwon Cho [ | 2 |
| Saralah Sovey and Mohd Effendi [ | 2 |
| Siu Cheung Kong [ | 2 |
| Barbara Bruno, Laila El-Hamamsy, and Estefanía Martín-Barroso [ | 2 |
| Kamisah Osman [ | 3 |
| Özgen Korkmaz [ | 3 |
| Jessica Dehler Zufferey [ | 3 |
| Marcos Román González [ | 4 |
The analysis also considered the authors most frequently cited in the reviewed articles (see Figure 7). Prominent sources include Brennan and Resnick, Selby and Woollard, the International Society for Technology in Education (ISTE), the Computer Science Teachers Association (CSTA), Román et al., and Korkmaz et al. The last two are notable references because the instruments they proposed, the Computational Thinking Test (CTt) and the Computational Thinking Scale (CTS), are frequently employed for CT measurement.
3.1.7 Journal, quartile, and JCR indices
In the analysis, two impact indicators evaluating the excellence of published content were employed. JCR, on the one hand, primarily focuses on citation counts, providing the impact factor and quartile of a journal. SJR, on the other hand, considers the quality, relative importance of citations, and quartile of a journal. According to the findings, the Journal of Educational Computing Research and Education and Information Technologies stood out as the most productive journals, with six and four publications, respectively. Regarding the two indices and quartiles, 21 % of the journals had no classification in any of the indices. Information on each journal is provided in Table 3. The most prominent journals, ranked by the number of CT-related publications, are listed in [
| Journal | Number of articles | Impact factor | JCR quartile | SJR indicator | SJR quartile |
| Journal of Educational Computing Research | 6 | 0.14 | Q1 | 1.28 | Q1 |
| Education and Information Technologies | 4 | 0.23 | Q1 | 1.06 | Q1 |
| Computer Science Education | 2 | 0.29 | Q2 | 1 | Q1 |
| Computers in Human Behavior | 2 | 0.033 | Q1 | 2.17 | Q1 |
| European Journal of Educational Research | 2 | No | No | 0.31 | Q3 |
| Frontiers in Psychiatry | 2 | No | No | 1.28 | Q1 |
| AERA Open | 1 | 0.26 | Q2 | 0.86 | Q1 |
| British Journal of Educational Technology | 1 | 0.019 | Q1 | 1.87 | Q1 |
| Computers & Education | 1 | 0.054 | Q1 | 3.68 | Q1 |
| Computers in Education | 1 | 0.094 | Q1 | 1.04 | Q1 |
| Computers in the Schools | 1 | 0.38 | Q2 | 0.92 | Q1 |
| Current Psychology | 1 | 0.41 | Q2 | 0.51 | Q2 |
| Revista Digital del Doctorado en Educación de la Universidad Central de Venezuela | 1 | No | No | No | No |
| Espacios | 1 | No | No | 0 | No |
| Hipotenusa: Journal of Mathematical Society | 1 | No | No | No | No |
| Open Conference on Computers in Education | 1 | No | No | No | No |
| Informatics in Education | 1 | 0.22 | Q1 | 0.96 | Q1 |
| Interactive Learning Environments | 1 | 0.096 | Q1 | 1.17 | Q1 |
| International Journal of Advanced Computer Science and Applications (IJACSA) | 1 | No | No | 0.28 | Q3 |
| International Journal of Child-Computer Interaction | 1 | No | No | 1.03 | Q1 |
| International Journal of Educational Methodology | 1 | No | No | No | No |
| International Journal of Learning, Teaching and Educational Research | 1 | No | No | No | No |
| International Journal of Recent Technology and Engineering (IJRTE) | 1 | No | No | No | No |
| International Journal on Informatics Visualization | 1 | No | No | 0.18 | Q4 |
| Journal of Computer and Mathematics Education | 1 | No | No | No | No |
| Journal of Research on Technology in Education | 1 | 0.28 | Q1 | 1.08 | Q1 |
| Journal of Science Education and Technology | 1 | 0.16 | Q1 | 1.15 | Q1 |
| Mathematics Teaching Research Journal | 1 | 0.26 | Q2 | 0.15 | Q4 |
| Pacific Rim Psychology | 1 | 0.62 | Q3 | 0.5 | Q2 |
| Participatory Educational Research | 1 | No | No | 0.25 | Q3 |
| Information and Technology in Education and Learning (ITEL) | 1 | No | No | No | No |
| Revista Iberoamericana de Evaluación Educativa | 1 | No | No | No | No |
| Sosyal Bilimler Enstitüsü Dergisi | 1 | No | No | No | No |
| Sustainability | 1 | 0.48 | Q2 | 0.66 | Q1 |
| Technology, Knowledge and Learning | 1 | No | No | 1.14 | Q1 |
| The All Ireland Journal of Teaching and Learning in Higher | 1 | No | No | No | No |
| Journal of the Human and Social Sciences Researches | 1 | No | No | No | No |
| Thinking Skills and Creativity | 1 | 0.23 | Q1 | 1.16 | Q1 |
| Transactions on Computing Education | 1 | 0.55 | Q3 | 0.99 | Q1 |
Figure 8 summarizes the quartiles assigned to the journals in which the articles were published. A total of 40 different journals were identified, of which 47.5 % have been classified in a JCR quartile and 65 % have been classified with the SJR indicator.
3.2 Analysis of the variables of interest
This systematic literature review included 52 articles, of which 40 introduced new instruments. The remaining 12 articles examined adaptations of these instruments. Table 4 lists the instruments, the reference to the original instrument, and the reference to the adapted version.
| Instrument | Original reference | Adapted reference | Adaptation |
| Holistic Assessment of Computational Thinking (Hi-ACT) | [ | [ | This article confirmed the psychometric properties of the instrument (validity and reliability). In total, 41 items were removed from the instrument. Ten constructs were evaluated: abstraction, algorithmic thinking, decomposition, debugging, generalization, evaluation, problem-solving, teamwork, communication, and spiritual intelligence. |
| Computational Thinking Scale (CTS) | [ | [ | This paper confirmed the construct validity of the CTS and its five dimensions. Two factors were identified: (1) creative thinking ability, cooperativity, and critical thinking skills and (2) algorithmic thinking. |
| | | [ | This paper confirmed the construct validity of the CTS and its five dimensions. The wording of six questions in the scale was adapted because they were written from a negative perspective. |
| | | [ | This study confirmed the construct validity of the CTS and its five dimensions. In the process of translating the scale, the authors determined the consistency between the structures in the original language and those in Chinese. |
| | | [ | This study confirmed the psychometric properties of the instrument (validity and reliability). Back-translation was used to verify the consistency between the structures in the original language and those in Chinese. The wording of items about problem-solving was changed. |
| | | [ | This paper confirmed the construct validity of the CTS and its five dimensions. Two items were removed from the creativity dimension, one from critical thinking, and three from problem-solving. |
| | | [ | This study confirmed the construct validity of the CTS and its five dimensions. Two factors were identified: (1) creative thinking ability, cooperativity, and critical thinking skills and (2) algorithmic thinking. |
| Computational Thinking Test (CTt) | [ | [ | Rasch scalability was applied as a technique to validate the psychometric properties of the skills in the CTt. Likewise, the Item Response Theory (IRT) was employed to verify the objectivity of the test. The CTt was not modified. |
| | | [ | This study confirmed the psychometric properties of the instrument (validity and reliability). Back-translation was used to verify the consistency between the structures in English and those in Turkish. |
| | | [ | This article examined the predictive validity of the CTt with respect to academic performance and learning on a virtual platform (code.org). |
| | | [ | This study confirmed the reliability of the instrument. Expert judgement was applied for the validation, and the final version had 28 items. |
| | | [ | The Item Response Theory (IRT) was used to verify the objectivity of the test and the difficulty of the items. The final version had 24 items because some questions about conditionals and loops were left out. |
| | | [ | Rasch scalability was applied as a technique to validate the psychometric properties of the skills in the CTt. The final version had 28 items because some questions about conditionals and loops were left out. |
| Computational Thinking Disposition Instrument (CTDI) | [ | [ | This study confirmed the psychometric properties of the instrument (validity and reliability). Nine items were removed from the cognitive and affective dimensions. |
The evidence reported above regarding the CTS suggests that all the adapted versions of this instrument underwent thorough validation of their psychometric properties. In the case of the CTt, there have been some linguistic adaptations and changes to a number of items.
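Several of the CTt adaptations listed in Table 4 relied on Rasch analysis. As a reminder of what that model assumes, the sketch below computes the probability of a correct response under the dichotomous Rasch model, in which that probability depends only on the difference between a person's ability and an item's difficulty. The function and parameter names (`rasch_probability`, `ability`, `difficulty`) are hypothetical, the numeric inputs are illustrative, and the reviewed studies may have used other model variants or dedicated software.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    P(X = 1) = exp(ability - difficulty) / (1 + exp(ability - difficulty)),
    with ability and difficulty expressed on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Illustrative values only (not taken from any reviewed study).
p_matched = rasch_probability(ability=0.0, difficulty=0.0)
p_easy = rasch_probability(ability=1.0, difficulty=-1.0)
p_hard = rasch_probability(ability=-1.0, difficulty=1.0)
print(p_matched, p_easy, p_hard)
```

When ability equals difficulty, the model gives an even chance of success; an item that is easy relative to the respondent pushes the probability above one half, and a relatively hard item pushes it below.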
3.2.1 Type of tool
The 40 CT assessment tools analyzed in this paper can be classified as shown in Figure 9. This classification is based on the name given by each author in their article. Scales were the most common format (28 %), followed by assessments (22.5 %) and tests (22.5 %).
3.2.2 Number of items
In these articles, 5.8 % of the instruments have up to ten items; 73 %, between 10 and 30 items; 9.7 %, between 31 and 40; and 11.5 %, more than 40.
3.2.3 Study population
In terms of study population, 25 % of the papers focused on college students, 3.8 % on teachers in training, 35 % on high school students, 21 % on primary students, and 6 % on children in early childhood education. Among these publications, 5.7 % addressed teachers. Figure 10 presents the frequency of each type of study population.
3.2.4 Skills/competencies assessed in CT
The diversity of definitions of CT indicates that the articles have addressed this construct from the perspectives of different skills or competencies. The most frequent skills/competencies they have discussed are abstraction, algorithmic thinking, problem-solving, decomposition, debugging, algorithms, and modularizing. Based on these 40 instruments, abstraction, algorithmic thinking, problem-solving, debugging, modularizing, and affective competencies have been evaluated since 2017. Decomposition was included in 2018. Some of the instruments assess cognitive skills along with affective and social skills, as well as attitudes. Table 5 presents the constructs assessed in each of the 40 instruments.
| # | Instruments / Constructs assessed | Abstraction | Algorithmic thinking | Problem-solving | Decomposition | Debugging | Algorithms | Modularizing | Affective dimensions / Attitudes |
| 1 | Holistic Assessment of Computational Thinking (Hi-ACT) | X | X | X | X | X | X | ||
| 2 | Programming-oriented Computational Thinking Scale (P-CTS) | X | X | ||||||
| 3 | Computational Thinking Skills (CTS) scale | X | X | X | |||||
| 4 | Computational Thinking Scale (CTS) | X | X | X | |||||
| 5 | CT Skill Level Scale | X | X | X | X | X | |||
| 6 | Tufts Assessment of Computational Thinking in Children-KIBO robot version (TACTIC-KIBO) | X | X | X | X | ||||
| 7 | Computer-based assessment | X | |||||||
| 8 | Computational Thinking Skills Scale | X | X | X | |||||
| 9 | CT Test (CTt) | X | X | X | X | X | |||
| 10 | Computational Thinking Self-Efficacy Scale | X | X | ||||||
| 11 | Computational Thinking Disposition Questionnaire | X | |||||||
| 12 | Computational Thinking Assessment of Chinese Elementary School Students (CTA-CES) | X | X | X | |||||
| 13 | Evaluación del PC basado en la resolución de problemas complejos [CT evaluation based on complex problem-solving] | X | |||||||
| 14 | Computational Thinking Disposition Instrument (CTDI) | X | |||||||
| 15 | Generic test to assess CT practices | X | X | X | |||||
| 16 | Assessment using card-based games | X | X | X | |||||
| 17 | Triangle examination using Bebras Challenge | X | X | ||||||
| 18 | Assessment of Computational Thinking in Early Childhood (TechCheck) | X | X | X | |||||
| 19 | Competent CT Test (cCTt) | X | |||||||
| 20 | Computational Thinking Scale (CTS) for computer literacy education | X | X | X | |||||
| 21 | Computational Thinking Competency Assessment (CTCA) | X | X | ||||||
| 22 | Computational Thinking Test Tool from Existing Models | X | |||||||
| 23 | Computational Thinking Concepts Test for Primary Education Adopting an ECD Approach | X | |||||||
| 24 | Computational Thinking Concepts Assessment | X | X | X | X | ||||
| 25 | Mathematical Computational Thinking Skill Test | X | X | X | |||||
| 26 | Algorithmic Thinking Test for Adults (ATTA) | X | X | X | X | ||||
| 27 | CT test, questionnaire, and interview | X | |||||||
| 28 | Questionnaire to assess CT components in teachers | X | X | ||||||
| 29 | College Students' Computational Thinking Multidimensional Test | X | X | X | |||||
| 30 | Computer Programming Self-Efficacy Scale (CPSES) | X | X | ||||||
| 31 | Instrument Test for Computational Thinking Skills Based on the Realistic Mathematics Education (RME) Approach | X | X | ||||||
| 32 | Computational Thinking Scale (CTS) | X | X | X | |||||
| 33 | Questionnaire of Computational Thinking (QCT) | X | X | X | |||||
| 34 | Scale of Self-Efficacy Perception Towards Teaching Computational Thinking | X | X | X | |||||
| 35 | Teacher Beliefs about Coding and Computational Thinking (TBaCCT) | X | X | X | X | ||||
| 36 | Teacher Efficacy and Attitudes Towards STEM for Teaching Computational Thinking (T-STEM-CT) | X | |||||||
| 37 | Assessment Tool for Measuring Computational Thinking Skills | X | X | X | X | ||||
| 38 | Early assessment | X | X | ||||||
| 39 | Computational Thinking Test for Elementary School Students (CTT-ES) | X | X | X | |||||
| 40 | Beginners’ CT test (BCTt) | ||||||||
| | Total | 17 | 13 | 12 | 16 | 9 | 14 | 7 | 11 |
3.2.5 Validity and reliability
Evidence of validity was reported for 87 % of the instruments, and evidence of reliability for 69 %. Figures 11 and 12 display the types of validity and reliability reported in the articles.
Regarding the evidence of reliability, 67 % of the articles refer to internal consistency, 30 % do not specify the method used, 5 % refer to test-retest reliability, and 2 % mention alternate-form reliability. Table 6 details the types of validity and reliability employed in each paper.
| Year | Instrument | Article title | Sample | Validity: Content | Validity: Construct | Validity: Criterion | Validity: No evidence | Reliability: Test-retest | Reliability: Internal consistency | Reliability: Alternate form | Reliability: No evidence |
| 2017 | Computational Thinking Scale (CTS) | A validity and reliability study of the computational thinking scales (CTS) | 580 | X | X | ||||||
| 2017 | CT Test (CTt) | Which cognitive abilities underlie computational thinking? Criterion validity of the Computational Thinking Test | 1521 | X | X | ||||||
| 2018 | Computational Thinking Skills (CTS) scale | A valid and reliable tool for examining computational thinking skills | X | X | X | X | |||||
| 2018 | Computational Thinking Test (CTt) | Can computational talent be detected? Predictive validity of the Computational Thinking Test | 314 | X | X | ||||||
| 2018 | Scale of Self-Efficacy Perception Towards Teaching Computational Thinking | The scale of self-efficacy perception towards teaching computational thinking: a validity and reliability study | 378 | X | X | ||||||
| 2019 | Holistic Assessment of Computational Thinking (Hi-ACT) | A proposal for holistic assessment of computational thinking for undergraduate: Content validity | 0 | X | X | ||||||
| 2019 | Computational Thinking Scale (CTS) | Adapting computational thinking scale (CTS) for Chinese high school students and their thinking scale skills level | 1015 | X | X | X | |||||
| 2019 | Computational Thinking Self-Efficacy Scale | Computational thinking self-efficacy scale: Development, validity, and reliability | 319 | X | X | X | |||||
| 2019 | Questionnaire to assess CT components in teachers | Computational thinking for preservice teachers in Thailand: A confirmatory factor analysis. | 747 | X | X | ||||||
| 2019 | Computer Programming Self-Efficacy Scale (CPSES) | Developing the Computer Programming Self-Efficacy Scale for Computer Literacy Education | 106 | X | X | ||||||
| 2019 | Computational Thinking Scale (CTS) | Development of Computational Thinking Scale: Validity and Reliability Study | 426 | X | X | X | |||||
| 2019 | Triangle examination using Bebras Challenge | Multivocal Challenge Toward Measuring Computational Thinking: Bebras Challenge Versus Computer Programming | 150 | X | X | ||||||
| 2019 | Computational Thinking Test Tool from Existing Models | Toward developing a real-world computational thinking test tool from existing models | 204 | X | X | ||||||
| 2020 | Holistic Assessment of Computational Thinking (Hi-ACT) | A Pilot Study of an Instrument to Assess Undergraduates’ Computational Thinking Proficiency | 548 | X | X | ||||||
| 2020 | Programming-oriented CTS (P-CTS) | A Valid and Reliable Scale for Developing Programming-Oriented Computational Thinking | 360 | X | X | ||||||
| 2020 | Adaption of the Computational Thinking Test | Adaption of the computational thinking test into Turkish | 502 | X | X | ||||||
| 2020 | Computational Thinking Skills Scale | The Development of Computational Thinking Skills Scale: Validity and Reliability Study | 254 | X | X | X | |||||
| 2020 | Computational Thinking Disposition Questionnaire | Development and Predictive Validity of the Computational Thinking Disposition Questionnaire | 907 | X | X | ||||||
| 2020 | Computational Thinking Assessment of Chinese Elementary School Students (CTA-CES) | Development and Validation of Computational Thinking Assessment of Chinese Elementary School Students | 280 | X | X | X | X | ||||
| 2020 | Assessment of Computational Thinking in Early Childhood (TechCheck) | TechCheck: Development and Validation of an Unplugged Assessment of Computational Thinking in Early Childhood Education | 768 | X | X | ||||||
| 2020 | Computational Thinking Test | Analysis of a Novel Computational Thinking Test in First Year Undergraduate Computer Science Course | 292 | X | X | ||||||
| 2021 | Adapted Computational Thinking Test (CTt) | A comprehensive assessment of secondary school students computational thinking skills. | 328 | X | X | ||||||
| 2021 | Computational Thinking Concepts Assessment | A principled approach to designing computational thinking concepts and practices assessments for upper elementary grades | 5698 | X | X | X | |||||
| 2021 | Assessment Tool for Measuring Computational Thinking Skills | An alternative approach for measuring computational thinking: Performance-based platform | 156 | X | X | X | X | ||||
| 2021 | Adapted Computational Thinking Test (CTt) | Assessing computational thinking abilities among Singapore secondary students: a Rasch model measurement analysis | 153 | X | X | ||||||
| 2021 | Computer-based assessment | Beyond Programming: A Computer-Based Assessment of Computational Thinking Competency | 119 | X | X | ||||||
| 2021 | CT test, questionnaire, and interview | Computational thinking evaluation tool development for early childhood software education | 0 | X | X | ||||||
| 2021 | Early assessment | Design and validation of learning trajectory-based assessments for computational thinking in upper elementary grades | 144 | X | X | ||||||
| 2021 | Instrument Test for Computational Thinking Skills Based on the Realistic Mathematics Education (RME) Approach | Development of Instrument Test Computational Thinking Skills IJHS/JHS Based RME Approach | 102 | X | X | ||||||
| 2021 | Adapted Computational Thinking Test (CTt) | Computational thinking in elementary and middle school students | 176 | X | X | ||||||
| 2021 | Evaluación del PC basado en la resolución de problemas complejos [CT assessment based on complex problem-solving] | Evaluar el PC mediante Resolución de Problemas: Validación de un Instrumento de Evaluación. (Spanish) | 38 | X | X | X | |||||
| 2021 | Questionnaire of Computational Thinking (QCT) | Examination of Turkish Middle School STEM Teachers' Knowledge about Computational Thinking and Views Regarding Information and Communications Technology | 121 | X | X | ||||||
| 2021 | Generic test to assess CT practices | Item response analysis of computational thinking practices: Test characteristics and students’ learning abilities in visual programming contexts | 13956 | X | X | ||||||
| 2021 | Assessment using card-based games | Measuring coding ability in young children: relations to computational thinking, creative thinking, and working memory | 15 | X | X | X | X | X | |||
| 2021 | Teacher Beliefs about Coding and Computational Thinking (TBaCCT) | Measuring teacher beliefs about coding and computational thinking | 245 | X | X | ||||||
| 2021 | Teacher Efficacy and Attitudes Towards STEM for Teaching Computational Thinking (T-STEM-CT) | Measuring in-service teacher self-efficacy for teaching computational thinking: development and validation of the T-STEM CT | 330 | X | X | ||||||
| 2021 | Computational Thinking Scale (CTS) for computer literacy education | The Computational Thinking Scale for Computer Literacy Education | 388 | X | X | ||||||
| 2022 | Mathematical Computational Thinking Skill Test | Analysis of Content Validity on Mathematical Computational Thinking Skill Test for Junior High School Student Using Aiken Method | 7 | X | X | ||||||
| 2022 | Algorithmic Thinking Test for Adults (ATTA) | Assessing Computational Thinking: Development and Validation of the Algorithmic Thinking Test for Adults | 289 | X | X | ||||||
| 2022 | Tufts Assessment of Computational Thinking in Children-KIBO robot version (TACTIC-KIBO) | Assessing young Korean children’s computational thinking: A validation study of two measurements | 450 | X | X | X | X | ||||
| 2022 | Beginners’ CT test (BCTt) | Comparing the psychometric properties of two primary school Computational Thinking (CT) assessments for grades 3 and 4: The Beginners’ CT test (BCTt) and the competent CT test (cCTt) | 575 | X | X | ||||||
| 2022 | Computational Thinking Test (CTt) | Computational Thinking Assessment – Towards More Vivid Interpretations | 202 | X | X | ||||||
| 2022 | College Students’ Computational Thinking Multidimensional Test | Developing College students’ computational thinking multidimensional test based on Life Story situations | 450 | X | X | ||||||
| 2022 | Computational Thinking Test for Elementary School Students (CTT-ES) | Development and Validation of the Computational Thinking Test for Elementary School Students (CTT-ES): Correlate CT Competency With CT Disposition. | 631 | X | X | X | X | ||||
| 2022 | Adapted self-report scale | Development of the Japanese Version of the Computational Thinking Scales for First-Year University Students in Humanities | 511 | X | X | X | |||||
| 2022 | Scale of CT Skill Levels | Evaluation and developmental suggestions on undergraduates’ computational thinking: a theoretical framework guided by Marzano’s new taxonomy | 737 | X | X | ||||||
| 2022 | Computational Thinking Disposition Instrument (CTDI) | Exploratory and Confirmatory Factor Analysis for Disposition Levels of Computational Thinking Instrument Among Secondary School Students | 500 | X | X | X | |||||
| 2022 | Computational Thinking Disposition Instrument (CTDI) | Gender differential item functioning analysis in measuring computational thinking disposition among secondary school students | 500 | X | X | X | |||||
| 2022 | Thai Self-Rating Version of the Computational Thinking Scale | Reliability and Construct Validity of Computational Thinking Scale for Junior High School Students: Thai Adaptation | 3241 | X | X | X | |||||
| 2022 | Competent CT Test (cCTt) | The competent Computational Thinking Test: Development and Validation of an Unplugged Computational Thinking Test for Upper Primary | 1519 | X | X | X | |||||
| 2022 | Computational Thinking Competency Assessment (CTCA) | The Use of Cognitive Diagnostic Modeling in the Assessment of Computational Thinking | 564 | X | X | ||||||
| 2022 | Computational Thinking Concepts Test for Primary Education Adopting an ECD Approach | Validating a computational thinking concepts test for primary education using item response theory: An analysis of students’ responses | 13670 | X | X |
This systematic literature review aimed to identify studies that have introduced instruments for measuring CT, as well as bibliometric variables and other variables of interest to delve deeper into this object of study.
A total of 52 research papers and 15 meta-analyses, mapping reviews, and systematic literature reviews were selected. Four studies are noteworthy ([
For this purpose, it was necessary to establish what methods have been used to determine their validity and reliability. This process can be addressed from the perspective of Classical Test Theory (CTT) and Item Response Theory (IRT). CTT evaluates the quality of tests by measuring internal consistency and content, criterion, and construct validity. IRT, on the other hand, offers a more advanced approach because it considers the individual characteristics of the items and participants, enabling a more accurate estimation of the skills under evaluation and a more sensitive assessment of performance. Integrating both theories allows for a more comprehensive and reliable evaluation of tests, facilitating decision-making in various educational and professional contexts. This review includes several articles that refer to adaptations of two instruments: the Computational Thinking Test (CTt) and the self-report Computational Thinking Scale (CTS). All the adaptations of the CTS [
Considering the authors of the 52 articles and those most cited within them, Marcos Román-González and Özgen Korkmaz were found to be at the top of both lists, demonstrating their extensive research experience in CT assessment.
The protocol for this review included six questions that can be used to delve deeper into this discussion:
| # | Title | Analysis |
| 1 | [ | Empirical studies on CT assessment in college students are summarized. Elements of CT assessment reviewed in this article: block-based assessments, knowledge/skill tests, self-report Likert scales, text-based programming projects, academic achievements of CS courses, as well as interviews and observations. |
| 2 | [ | Key characteristics of CT assessments for K-12 students are identified and classified. Elements of CT assessment reviewed in this article: tangible tasks, programming projects, self-report Likert scales, and single- and multiple-choice questions. |
| 3 | [ | Approaches for assessing block-based programming activities for K-12 students are analyzed. Elements of CT assessment reviewed in this article: block-based assessments and programming projects that emphasize computational concepts. |
| 4 | [ | CT implementation contexts and CT assessment tools across all educational levels are reviewed. Elements of CT assessment reviewed in this article: portfolio, interviews, knowledge tests, and a combination of tools. |
| 5 | [ | This article analyzes the relationship between CT and academic performance in primary school students. Elements of CT assessment reviewed in this article: academic performance and knowledge tests. |
| 6 | [ | Existing CT studies with pre-school age participants are examined. Elements of CT assessment reviewed in this article: block-based assessments and computational concepts and perspectives. |
| 7 | [ | This paper describes the different ways in which CT has been operationalized and implemented in practice. Elements of CT assessment reviewed in this article: computational concepts, practices, and perspectives. |
| 8 | [ | Educational contexts where CT has been implemented are presented, highlighting the ways in which CT can be assessed/measured. Elements of CT assessment reviewed in this article: computational concepts, practices, and perspectives. |
| 9 | [ | This study investigates the relationship between CT skills development in learning settings, conceptual understanding, and CT-related dimensions. Elements of CT assessment reviewed in this article: programming-related and non-programming activities. |
| 10 | [ | A conceptual model is designed for six CT areas: knowledge, learning strategies, assessment, tools, factors, and capacity building. Elements of CT assessment reviewed in this article: self-report scales, tests, artifact analysis, and observations. |
| 11 | [ | Robots and processes used in CT assessment are reviewed. Elements of CT assessment reviewed in this article: portfolio, tests, and surveys. |
| 12 | [ | Research trends in the field of CT are analyzed. Elements of CT assessment reviewed in this article: computational concepts, practices, and perspectives. |
| 13 | [ | This review proposes the operationalization of abstraction in the context of CT. Elements of CT assessment reviewed in this article: abstraction and generalization. |
| 14 | [ | This study establishes the impact of programming teaching on K-12 students’ CT skills. Elements of CT assessment reviewed in this article: programming tools. |
| 15 | [ | This study determines the properties of the instruments developed to measure CT. Elements of CT assessment reviewed in this article: psychometric properties and thinking skills. |
Skills, attitudes, and perceptions are dimensions widely used to measure CT. Some studies [
This study reviewed 52 research articles about CT assessment and measurement published between 2012 and 2022 in academic journals. Additionally, it analyzed scoping reviews, systematic mapping reviews, and literature reviews, which revealed (a) the interest in consolidating the evidence on CT assessment and (b) research gaps for this review. Consequently, this literature review was conducted to learn about the psychometric properties of CT assessment and measurement instruments, as well as CT-related variables.
This systematic review implemented a process that ensures the repeatability of the review protocol. The research questions helped to define the limits of the bibliometric variables and variables of interest. The bibliometric variables indicate that the number of articles on CT measurement instruments has increased since 2019. Most documents on CT measurement have been published in Türkiye, the US, and China. Based on the JCR index, 27.5 % of the articles were published in Q1 journals; based on the SJR index, 47.5 % were featured in Q1 outlets. The review found no evidence of CT measurement instruments in Colombia. Regarding key authors, Brennan and Resnick, as well as Selby and Woollard, are commonly cited due to their widely recognized CT curriculum designs. Marcos Román-González and Özgen Korkmaz stand out for their numerous publications and the international adaptation of their instruments. The Computational Thinking Scale (CTS) has been adapted and psychometrically validated in Europe and Asia, while the Computational Thinking Test (CTt) has been adapted and its validity has been established in the same regions. However, no adaptations of these instruments to Latin American countries were identified. As the instruments were mostly applied to high school and college students, future research should address other populations, such as young children or adults.
The results of this review highlight the diverse range of CT skills that can be evaluated. Among these skills (that have a multidimensional origin), algorithmic thinking, cognitive skills, and problem-solving capabilities are the most common. Computational capabilities are also widely assessed, especially concepts that are directly related to computer programming, such as sequences, conditionals, loops, and events. It should be noted that abstraction has been commonly evaluated across all populations, but there is little scientific evidence of a rigorous evaluation of this construct. Only Ezeamuzie et al. have formally operationalized this skill.
This review revealed a variety of instruments to measure CT, with scales being the most frequently used format. This suggests that CT should be assessed comprehensively by addressing a wide range of associated skills, concepts, attitudes, and procedures. Most reviewed instruments demonstrated both validity and reliability, with content and construct validity, as well as internal consistency, being the predominant criteria. The statistical methods most commonly employed to analyze these properties are correlation, Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), and Cronbach’s alpha.
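As an illustration of the internal-consistency criterion mentioned above, the following minimal Python sketch computes Cronbach’s alpha from a respondents-by-items score matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the sample responses are hypothetical and serve only as an example; they do not come from any of the reviewed instruments.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    n_respondents = len(scores)
    k = len(scores[0])
    if n_respondents < 2 or k < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same number of items")

    # Variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: five respondents, three Likert-type items (1 to 5).
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values of alpha closer to 1 indicate that the items vary together, which is usually read as higher internal consistency; the thresholds considered acceptable depend on the purpose of the instrument and are not fixed by this sketch.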
This literature review contributes to future studies by demonstrating the progress made in CT assessment through the use of measurement instruments with strong psychometric properties. In conclusion, this review accomplished its objective: it identified the tools that have been used to measure CT, along with their psychometric properties and the skills they assess.
The authors thank the Ph.D. program in Education Sciences at Universidad del Quindío, which provided funding for Milena Corrales Álvarez and Lina Marcela Ocampo to complete their doctoral studies.
They would also like to thank Universidad del Quindío for funding the research project (Code 1187 of the Call for Proposals No. 14 of 2022).
The authors declare that there is no conflict of interest.
Milena Corrales-Álvarez: Methodology, Conceptualization, Investigation, and Writing - Review & Editing.
Lina Marcela Ocampo: Methodology, Conceptualization, Investigation, and Writing - Review & Editing.
Sergio Augusto Cardona-Torres: Conceptualization, Investigation, and Writing - Review & Editing.