Title: How Open is Open? Transparency and Accountability in Open-Source LLMs
Speaker: Frank Coyle, SMU
Beyond Marketing Claims
The world of Natural Language Processing (NLP) has seen a significant
transformation with the advent of Large Language Models (LLMs) such as ChatGPT.
Companies deploy ChatGPT for tasks such as customer service, sentiment
analysis, and marketing, while researchers are exploring its use in fields
such as natural language processing, psychology, and linguistics.
Despite its widespread use, ChatGPT has some major drawbacks. For
example, it is known to generate factually incorrect responses, often referred
to as hallucinations. In addition, models of this kind often exhibit a variety
of biases that trace back to the data used to build them.
The presentation will emphasize the need for genuine openness in
open-source AI software and examine the implications of models that may not
fully disclose their data sources and algorithms.
A recent study of AI software (Nolan, 2023) found numerous instances in which
software claiming to be open source failed to provide clear details about the
source of its training data and the underlying algorithms. The audience will gain
insights into why knowing the origin of training data is vital. Hidden data
sources can introduce biases and reinforce inequalities, which can have
real-world consequences.
When LLMs marketed as open source keep their algorithms proprietary, it becomes
challenging to evaluate and scrutinize their operations, leading to a lack of
accountability. Attendees will learn how proprietary algorithms can hinder the
identification and correction of algorithmic biases. Issues of fairness,
accountability, and responsible AI are central themes. Attendees will gain a
deeper understanding of the risks posed by models that are not as open as they
claim to be.
The presentation concludes by advocating for genuine transparency and
accountability in open-source LLMs. Attendees will leave with a call to action,
encouraging them to support projects that adhere to open-source principles in
both word and spirit.
Nolan, Michael. "Llama and ChatGPT Are Not Open-Source." IEEE Spectrum, July
2023.