VET4Youth: Mapping digital skills in Initial Vocational Education and Training curricula using Natural Language Processing techniques

Start Date: 10/12/2024
End Date: 31/12/2027
Funding: European Centre for the Development of Vocational Training (Cedefop)
Project Leader: Papageorgiou Harris

The VET4Youth project maps and analyses the integration of digital skills in Initial Vocational Education and Training (IVET) curricula across eight EU Member States. Focusing on the intended curricula, it uses Natural Language Processing (NLP) to extract, classify and compare digital skills across countries and sectors, leveraging established European frameworks such as ESCO and DigComp.

The methodological pipeline includes: (a) collection, cleaning, normalisation and translation of curricula into machine-readable formats; (b) skill extraction through semantic retrieval, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), contextual analysis and expert validation; (c) development of a new taxonomy extending ESCO and DigComp; and (d) a validation interface for expert review and curation.
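As a rough illustration of the retrieval idea in step (b), the sketch below ranks candidate skill labels against a curriculum sentence. It is not the project's implementation: a simple bag-of-words cosine similarity stands in for the semantic similarity models, and the skill labels, the function names (`tokenize`, `cosine`, `retrieve_skills`) and the `top_k` and `threshold` parameters are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical skill labels, invented for illustration only. They are not
# taken from ESCO, DigComp or the project's taxonomy.
SKILL_LABELS = [
    "use spreadsheet software",
    "manage digital documents",
    "program computer numerical control machines",
]


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_skills(sentence, labels=SKILL_LABELS, top_k=2, threshold=0.2):
    """Return up to top_k (score, label) pairs scoring at least threshold.

    Bag-of-words cosine is a stand-in here; a real pipeline would more
    likely compare dense sentence embeddings.
    """
    query = Counter(tokenize(sentence))
    scored = [(cosine(query, Counter(tokenize(label))), label) for label in labels]
    scored = [pair for pair in scored if pair[0] >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    sentence = "Students learn to use spreadsheet software for budgeting"
    for score, label in retrieve_skills(sentence):
        print(f"{score:.2f}  {label}")
```

In a pipeline of the kind described above, candidates retrieved this way would presumably still pass through LLM-based contextual analysis and expert validation before being accepted as extracted skills.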

The technical infrastructure integrates semantic similarity models, LLMs, RAG and Machine Translation (MT) models, ensuring multilingual comparability, transparency, and high-quality results across all participating countries.

The project provides robust cross-country insights and good practices to inform policy design and curriculum innovation for strengthening digital skills provision in IVET. It will also analyse gaps between the supply of digital skills in IVET curricula and labour market demand, linking the findings to Cedefop’s Skills-OVATE and the European Skills and Jobs Survey (ESJS).

ILSP coordinates the VET4Youth project and leads its scientific design and overall implementation. It oversees the data collection from participating countries and develops and maintains the NLP infrastructure, including the data processing and translation pipeline, the skill retrieval and extraction engine, and the validation interface used by experts. ILSP also leads the development of the taxonomy for emerging digital skills and the analysis of skills mismatches between education and labour market needs, ensuring the scientific consistency and quality of all outputs.

ILSP is responsible for preparing the project’s deliverables and analytical reports, which cover methodology, datasets, country analyses and policy insights. It also coordinates the network of national VET experts who review and validate the results, so that the analysis reflects the specific context of each participating country and supports the project’s cross-country comparability and policy relevance.