Publications from Uppsala University (uu.se)
Hallucination Detection in LLMs: Fast and Memory-Efficient Finetuned Models
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology; Department of Informatics, University of Oslo. ORCID iD: 0009-0009-7464-2093
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control; Automatic Control; Artificial Intelligence. ORCID iD: 0000-0001-5183-234X
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control. ORCID iD: 0000-0001-8182-0091
2024 (English). Manuscript (preprint) (Other academic)
Abstract [en]

Uncertainty estimation is a necessary component when deploying AI in high-risk settings such as autonomous driving, medicine, or insurance. Large Language Models (LLMs) have surged in popularity in recent years, but they are prone to hallucinations, which may cause serious harm in high-risk settings. Despite their success, LLMs are expensive to train and run: they require large amounts of computation and memory, which prevents the use of ensembling methods in practice. In this work, we present a novel method that allows for fast and memory-friendly training of LLM ensembles. We show that the resulting ensembles can detect hallucinations and are a viable approach in practice, as only one GPU is needed for training and inference.
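The abstract does not specify how ensemble disagreement is converted into a hallucination score. As an illustrative sketch only (not the authors' method), a common choice in Bayesian ensembling is the mutual-information (BALD) score over the ensemble members' next-token distributions: it is low when members agree and high when they disagree, so a high score can flag likely hallucinations. All function names below are hypothetical.

```python
import math

def mean_dist(dists):
    # Average the ensemble members' next-token probability distributions.
    k, n = len(dists), len(dists[0])
    return [sum(d[i] for d in dists) / k for i in range(n)]

def entropy(p):
    # Shannon entropy in nats; skip zero-probability entries.
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def mutual_information(dists):
    # BALD score: entropy of the mean distribution minus the mean of the
    # members' entropies. High values indicate ensemble disagreement
    # (epistemic uncertainty), a candidate hallucination signal.
    avg = mean_dist(dists)
    return entropy(avg) - sum(entropy(d) for d in dists) / len(dists)

# Agreeing members yield a near-zero score; disagreeing members a high one.
agree = [[0.90, 0.05, 0.05], [0.88, 0.07, 0.05]]
disagree = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05]]
assert mutual_information(disagree) > mutual_information(agree)
```

In a real pipeline such a score would be computed per generated token and aggregated over the sequence; the threshold for flagging a hallucination would be tuned on held-out data.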

Place, publisher, year, edition, pages
2024.
Keywords [en]
LLM, AI, Bayesian ensemble
National Category
Computer Systems
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:uu:diva-543911
OAI: oai:DiVA.org:uu-543911
DiVA, id: diva2:1916493
Conference
Northern Lights Deep Learning 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Kjell and Marta Beijer Foundation
National Academic Infrastructure for Supercomputing in Sweden (NAISS), 2022-06725
EU, European Research Council, 101054643
Available from: 2024-11-27. Created: 2024-11-27. Last updated: 2024-11-28. Bibliographically approved.

Open Access in DiVA

fulltext (835 kB), 277 downloads
File information
File name: FULLTEXT01.pdf
File size: 835 kB
Checksum (SHA-512): 0c3922e7f3a7ea69b33705e93ba2218c51aaa497789b15e98ef932e0e06fc85b8cea4d4774ae4bbade817c4f80202b9ddbd3d8318d339e1b670c1e892a592e0e
Type: fulltext
Mimetype: application/pdf

Authority records

Arteaga, Gabriel Y.; Schön, Thomas B.; Pielawski, Nicolas

Total: 277 downloads
The number of downloads is the sum of all downloads of full texts. It may include e.g. previous versions that are no longer available.
