Imagine a public competitive exam with questions on Italian language and culture, such as those taken by candidates pursuing a career in teaching or public administration. Suppose that, when the results are published, the Italian candidates barely pass, while foreign "competitors", especially the Americans, dominate the ranking with extremely high scores. It sounds like a provocation, yet this is exactly what a group of researchers from Milan's Bicocca University found in their study. The project, called ITALIC (Italian Language and Culture), created a new benchmark to test AI models' competence in Italian language and culture. The researchers gathered and selected 10,000 questions from real Italian public competitions to test language models, both Italian and international, on topics ranging from Dante to lexicon, geography to grammar.
The results are clear: Italian models are falling behind. None exceeds 70 points, whereas foreign ones score well above 80, reaching 90 in the case of GPT-4o (OpenAI) and Claude 3.5 Sonnet (Anthropic). Other international models such as Mistral (France) and LLaMA (Meta) also perform better. In contrast, Italian LLMs, including Velvet (Almawave), Minerva (Sapienza), Modello Italia (iGenius), and Miia (Fastweb), show a substantial gap, despite being trained specifically on Italian.
"The problem is not the quality of the work done, but the scale", says Fabio Mercorio, a professor of computer science at Bicocca University and one of the study's authors. "Italian models have between 7 and 14 billion parameters. The American ones total hundreds of billions. It is like comparing a competent local physician with an American one who has had access to ten times more data: he can do everything, even in Italian." The issue goes beyond a technological dispute.
At a time when initiatives to create "national" AI models are multiplying in Italy, the study raises a strategic question: is it realistic to expect to bridge the gap solely through tailored training, without access to infrastructure, investments, and data comparable to those of the international giants? The goal of ITALIC, according to Mercorio, is not to penalize Italian models but to provide a public, transparent, and verifiable standard for objectively assessing the state of the art: a useful reference for firms, organizations, and investors who will need to decide whether to use Italian or international solutions in the coming years. "In Italy there are still no companies capable of competing alone with the American giants", said Mercorio. "If artificial intelligence is really a strategic issue, there is only one way: to join forces at the European level".