alt.ai develops LLM hallucination automatic scoring engine

Automatic hallucination scoring enables detection of false output occurrences in generative AI

TOKYO, JAPAN, May 14, 2024 /EINPresswire.com/ -- alt Inc. (https://alt.ai/en/), the Japan-based developer and distributor of Personal Artificial Intelligence (P.A.I.®) and AI clone technology (head office: Minato-ku, Tokyo; CEO: Kazutaka Yonekura), is pleased to announce that we have successfully developed a method for scoring hallucinations in large language models (LLMs).

Hallucination is a phenomenon in which an LLM gives a false or unjustified answer, one based not on fact but on incorrectly interpreted training or input data. Such incorrect output can cause serious trust issues for companies and individuals, and it presents a significant barrier to future applications of LLMs.

alt has been a pioneer in the development and provision of LLMs in Japan, and has applied that experience to research and development aimed at solving the hallucination problem. Recently, alt developed its own method to automatically evaluate the probability of hallucination (the hallucination score) and used this technology to build an automatic hallucination score evaluation engine.

The engine achieved an accuracy of 72% in a hallucination detection task on a pseudo-evaluation set created from the JCommonsenseQA dataset. It can already score hallucinations for various LLMs, such as GPT-3.5 and Llama 2, as well as LHTM-OPT, a lightweight large language model developed by alt.

In addition, the automatic hallucination score evaluation engine emphasizes consistency in its evaluation of LLM outputs. Specifically, it runs the generation process multiple times on the same input data and compares the results. This approach identifies discrepancies and inconsistencies in the generated content, and these form the basis of a probabilistic assessment of whether hallucination (that is, inaccurate output not based on training data or facts) has occurred.
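
The consistency idea described above can be illustrated with a minimal Python sketch. This is not alt's engine or its scoring formula, which the announcement does not disclose. It assumes a simple stand-in: sample several answers for one prompt, measure pairwise agreement with token-level Jaccard overlap, and report one minus the mean agreement as a score. The names token_jaccard, hallucination_score, score_prompt, and the generate callable are all hypothetical.

```python
from itertools import combinations
from typing import Callable


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace tokens (1.0 = same token set)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty answers are treated as identical
    return len(ta & tb) / len(ta | tb)


def hallucination_score(samples: list[str]) -> float:
    """Return 1 - mean pairwise similarity; 0.0 means fully consistent samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compare")
    pairs = list(combinations(samples, 2))
    mean_similarity = sum(token_jaccard(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


def score_prompt(generate: Callable[[str], str], prompt: str, n: int = 5) -> float:
    """Call a (hypothetical) text generator n times on one prompt and score it."""
    samples = [generate(prompt) for _ in range(n)]
    return hallucination_score(samples)
```

Under these assumptions, three samples "Tokyo", "Tokyo", and "Kyoto" agree in one of three pairs, so the score is about 0.67, while three identical samples score 0.0. A production system would likely replace the token overlap with a semantic comparison and calibrate the score against labeled data.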

The automatic hallucination score evaluation engine is available now through our alt developer API service (alt Developer: https://developer.alt.ai/api-doc.html#tag/LHTM-OPT/operation/lhtm-opt-completion).

For more information on this and other projects utilizing LLMs, please reach out to the alliance contact point below.

▶Hallucination score measurement application demo video
https://youtu.be/-_k-SDIPje4

■About alt Inc.
Founded in November 2014, alt is a startup that "aims to free people from unproductive labor" by creating P.A.I.® (Personal Artificial Intelligence) and AI clones. We also develop and provide various AI products built on our range of foundational AI technologies, including generative AI, a proprietary LLM, and speech recognition technologies. As of April 2024, alt has raised over 10 billion yen.
https://alt.ai/en

<Media Inquiries to:>
Misako Nishizawa (Media Relations)
e-mail: press@alt.ai

<Alliance Inquiries to:>
We provide AI solutions and support regardless of genre, including IT, finance, construction, logistics, media, manufacturing, retail, and service industries.
Please feel free to contact us.

Katsuya Asai (AI Solutions Business Department)
e-mail: gptsolutions@alt.ai

Misako Nishizawa
alt Inc.
+81 3-6455-4677
press@alt.ai