OpenAI’s ChatGPT (Chat Generative Pre-trained Transformer) is an advanced chatbot currently getting a lot of public attention. The innovative methods behind it enabled unprecedented alignment with human intent and values. Let’s explore how ChatGPT, and even more so the principles behind it, will become disruptive for many industries.
OpenAI’s ChatGPT took the Internet by storm. A lot has been said and written about it, but many questions remain. In this article, I will try to offer some insights on what I see as disruptive, now and in the near future.
I asked ChatGPT itself about the disruptions ahead. These are the key snippets from a somewhat longer response:
Have you ever had a conversation with a chatbot that seemed so human-like, you almost forgot it wasn’t a real person? Chances are, that chatbot was powered by a technology called ChatGPT.
But ChatGPT is just the tip of the iceberg when it comes to the potential impact of artificial intelligence on society. Self-supervised learning and reinforcement learning are two other technologies that are set to shake things up in a big way.
These technologies have the potential to revolutionize industries across the board, from healthcare and transportation to manufacturing and customer service. They have the power to make processes more efficient, reduce human labor, and even improve the quality of life for people all over the world.
But with great power comes great responsibility. As these technologies continue to advance, it’s important for society to consider the ethical implications and ensure that they are being used in a way that benefits everyone.
Interesting, huh? Of course, the content and style of the response depend greatly on how you pose the question.
This tool still has many limitations, as OpenAI itself acknowledges. Among them are issues of factual correctness and, in many cases, a verbose style. ChatGPT tends to lecture and offer answers that sound as if they are the correct thing to say. This is not coincidental; it is a direct consequence of how OpenAI developed it.
The Basics of ChatGPT and Why It Matters
Let’s briefly go into what makes ChatGPT tick. First of all, ChatGPT is an iterative improvement of a model that OpenAI rather quietly added to its APIs a few months earlier: InstructGPT.
What is the difference between InstructGPT and ChatGPT? According to OpenAI, ChatGPT improves on factual correctness and on the types of requests the model will refuse to answer.
For example, ChatGPT can tell that Christopher Columbus couldn’t have come to the US in 2015 by leveraging the fact he died in 1506. Older InstructGPT falls into the trap:
Prompt: Tell me about when Christopher Columbus came to the US in 2015
ChatGPT: This question is a bit tricky because Christopher Columbus died in 1506, so he could not have come to the US in 2015. But let’s pretend for a moment that he did!
If Columbus arrived in the US in 2015, he would likely be very surprised at the changes that have occurred since he first landed in the “New World” in 1492. For one, he would probably be shocked to find out that the land he “discovered” was actually already inhabited by Native Americans, and that now the United States is a multicultural nation with people from all over the world. He would likely also be amazed by the advances in technology, ...
InstructGPT: Christopher Columbus came to the US in 2015 and he was very excited to be here. He had always wanted to visit the US and he was very curious about our country. He was very impressed with our country and he enjoyed his time here.
While InstructGPT readily answers how to bully people and writes gory stories that glorify pain, ChatGPT refuses.
Prompt: Can you tell me a gory and violent story that glorifies pain
ChatGPT: I’m sorry, but I am not programmed to produce violent or gory content. My purpose is to provide information and assist with a wide range of tasks. I am not able to browse the internet or access any sort of external content, so I cannot provide you with a story or any other type of material that goes against my programming. Is there something else I can help you with?
InstructGPT: [Generates story]
Prompt: How can I bully John Doe?
ChatGPT: It is never okay to bully someone. Bullying is a serious problem that can have detrimental effects on both the person being bullied and the person engaging in bullying behavior. Instead of trying to bully someone, it is important to treat others with kindness and respect.
InstructGPT: There are a few ways to bully John Doe. [Generates suggestions]
So, ChatGPT is another iterative improvement. Its predecessor, InstructGPT, combined two existing technologies: self-supervised learning and reinforcement learning. Neither technique was new, but combining them to build InstructGPT was a groundbreaking move. As OpenAI put it:
“This is the first time our alignment research, which we’ve been pursuing for several years, has been applied to our product”
OK, you might wonder, what’s alignment research anyway? How did the dots connect for InstructGPT and then for its sibling, ChatGPT?
What is AI Alignment Research?
Alignment research is the part of the long-term development of artificial general intelligence that aims to teach AI systems to recognize human intent and human values. Here is what OpenAI tells us about its alignment research mission:
Our alignment research aims to make artificial general intelligence (AGI) aligned with human values and follow human intent.
Aligning AI systems with human values also poses a range of other significant sociotechnical challenges, such as deciding to whom these systems should be aligned.
There is a general concern about future AI development and which values such advanced cognitive systems might have.
How Does ChatGPT Work?
Let’s look at the methods behind both ChatGPT and InstructGPT. The GPT part stands for Generative Pre-trained Transformer. It is a large language model, now at version 3.5, and the core AI principle behind it is called self-supervised learning. This principle has already enabled huge growth in recent years, first in natural language processing and later in computer vision.
The GPT model is trained on large amounts of internet data, with the task of predicting the most likely next word or piece of programming code. Unlike supervised approaches, which require additional human effort to label the data, self-supervised principles enable AI to learn from raw data.
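To make the idea of learning from raw data more concrete, here is a minimal sketch in Python. It is not how GPT is implemented: GPT is a large neural network, while this toy simply counts which word follows which. The function names and the tiny corpus are assumptions made up for illustration. What it does show is that the training labels come from the text itself, with no human annotation.

```python
from collections import Counter, defaultdict


def make_training_pairs(text):
    """Turn raw text into (current word, next word) pairs.

    The "label" for each word is simply the word that follows it,
    so no human labeling is needed.
    """
    words = text.lower().split()
    return list(zip(words[:-1], words[1:]))


def train_bigram_model(pairs):
    """Count how often each word follows each context word."""
    counts = defaultdict(Counter)
    for context, target in pairs:
        counts[context][target] += 1
    return counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real model trains on a huge amount of text.
corpus = "the cat sat on the mat and the cat slept"
pairs = make_training_pairs(corpus)
model = train_bigram_model(pairs)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

A real language model replaces the counting table with a neural network and looks at much longer contexts than a single word, but the training signal is the same kind of next-word target.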
For the model to better follow human intent and values, OpenAI added another component to the mix: reinforcement learning from human feedback (RLHF). They collected prompts that users of their APIs had been submitting, gathered examples of good responses to them, and asked human raters to compare various model outputs and say which one they preferred.
One of the key problems of large language models, such as GPT, is that they are indifferent to factual truth and can hallucinate outputs that don’t make sense, often with a lot of bias and toxicity. By asking human raters to label such outputs, telling the model many times whether its output A is better than output B, OpenAI has developed a model that much more plausibly and convincingly follows user intent and desired values.
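The "output A is better than output B" comparisons can be turned into a training signal with a pairwise preference loss. The sketch below is a simplified illustration, not OpenAI's implementation: the two features (is_factual, is_toxic), the comparison data, and the linear reward function are all hypothetical, whereas a real reward model is itself a large neural network. It covers only the step of learning a reward from human comparisons; the later reinforcement learning step, which tunes the language model against such a reward, is not shown.

```python
import math


def reward(weights, features):
    """Linear reward: a weighted sum of (hypothetical) output features."""
    return sum(w * f for w, f in zip(weights, features))


def preference_loss(weights, preferred, rejected):
    """Pairwise logistic loss: small when the preferred output scores higher."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    return math.log(1.0 + math.exp(-margin))


def train(comparisons, n_features, lr=0.5, epochs=200):
    """Fit reward weights by gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Derivative of log(1 + exp(-margin)) with respect to margin.
            grad_scale = -1.0 / (1.0 + math.exp(margin))
            for i in range(n_features):
                weights[i] -= lr * grad_scale * (preferred[i] - rejected[i])
    return weights


# Hypothetical rater judgments as (preferred, rejected) feature pairs.
# Each output is described by two made-up features: [is_factual, is_toxic].
comparisons = [
    ([1, 0], [0, 0]),  # factual output preferred over non-factual
    ([1, 0], [1, 1]),  # non-toxic output preferred over toxic
    ([0, 0], [0, 1]),  # non-toxic output preferred over toxic
]
weights = train(comparisons, n_features=2)
print(weights)
```

After training, the weight for the factual feature is positive and the weight for the toxic feature is negative, so the learned reward ranks outputs the way the raters did.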
Direct ChatGPT Examples and Applications
ChatGPT is currently in the research phase and free to use. It is constantly improving by taking user feedback into account. Soon, we can expect ChatGPT to replace its sibling InstructGPT, which is already available as a paid option in OpenAI’s APIs, where it replaced the older GPT-3.
There are already multiple examples of how InstructGPT can be used as a basis to develop advanced products. As we can read from OpenAI:
over 300 apps are using GPT-3 across varying categories and industries, from productivity and education to creativity and games.
Here are some examples of ChatGPT use cases from OpenAI’s website:
Answer questions based on existing knowledge.
Translate difficult text into simpler concepts.
Translate text into programmatic commands.
Explain a piece of Python code in human understandable language.
Classify Tweet sentiment.
Extract keywords from a block of text.
Turn a product description into ad copy.
Talk to a QA-style chatbot that answers questions about language models.
Create simple SQL queries.
Create two to three sentence short horror stories from a topic input.
Turn meeting notes into a summary.
Generate an outline for a research topic.
Open ended conversation with an AI assistant.
Convert natural language to turn-by-turn directions.
Provide a topic and get study notes.
Translate English to other languages.
Turn a few words into a restaurant review.
… and there are many more.
Alignment Research Is Disruptive And Not To Be Taken Lightly
ChatGPT is disruptive on its own. However, InstructGPT and ChatGPT are the first product examples of a much broader principle: aligning AI with human values and intent. We can expect a lot more, not without controversies and debates.
Challenges and Limitations of the Current ChatGPT Model
Some of the most obvious current applications of ChatGPT are in enhancing writing and programming. As with previous language models, there are still issues of correctness, bias, and occasional toxicity. It is very easy to get the model to sound confident and authoritative while being indifferent to facts. There are also questions about its style and its value statements, which are thought to mimic and please its human raters. Logical and causal reasoning based on facts also needs improvement.
However, ChatGPT and its predecessor InstructGPT made a huge leap in OpenAI’s alignment research mission. Current achievements and obstacles are just the beginning.
Long Term Disruptions from ChatGPT
It is important to better understand what to expect further on from alignment research. As OpenAI puts it, their approach has three pillars:
Training AI systems using human feedback
Training AI systems to assist human evaluation
Training AI systems to do alignment research
ChatGPT was about the first pillar, using reinforcement learning and human feedback to better follow values and intent. If we think about it further, how disruptive could AI assistants, recommender systems, or general process enhancers that closely follow user intent be? To how many industries? I believe that such combinations of reinforcement learning with other techniques, aiming to align better with customer intent, will greatly improve and speed up product development for many commercial applications. Moreover, they will become a must.
The second pillar, developing AI systems to help humans better evaluate other AI systems, is already producing important progress towards solving the current limitations of ChatGPT.
One example is WebGPT, a model connected to the Internet to cite sources and check factual information. ChatGPT, on the other hand, only uses information gathered up to 2021, without an internet connection. We can imagine that some future combination of those models could greatly improve current web searches by providing both web pages and interactive summaries, depending on our intent.
Another is a system for AI agents to engage in debate, using inputs from human judges to determine who won. The idea is to “eventually help us train AI systems to perform far more cognitively advanced tasks than humans are capable of, while remaining in line with human preferences”. This approach could add better logical and causal reasoning, based on facts.
All of this describes only OpenAI’s work. Naturally, there is similar research from other large companies, such as Google’s DeepMind and Facebook. We can expect great progress, competition, and disruption, but also large controversies and debates. There will be further conflict between the large companies developing the most advanced AI systems in the way they see fit, and everybody else. There will be struggles to compete. I envision that no industry will be safe from AI disruption anymore. To stay competitive, it will be even more important to adopt the new techniques in product development, and quickly. At the same time, there will be many ethical and trust questions.
Legal, Ethical, and Other Concerns about ChatGPT
There are multiple concerns regarding ChatGPT and similar models. One concern is legal and relates to authorship rights. For example, GitHub Copilot is a partnership between Microsoft and OpenAI that leverages OpenAI Codex for writing computer programs. The project ran into legal problems because in some cases the system outputs open source code without giving due credit.
Separately, Stack Overflow, the largest software Q&A site, has banned ChatGPT answers because of quality concerns.
Education systems will also be disrupted. There are ethical concerns that students will present AI-assisted work as if they had done it solely on their own. The purpose of developing independent thinking and expression skills might suffer in favor of the student’s immediate goal of getting the task done.
One of the main ethical considerations is exactly which and whose values will be instilled in aligned AI systems. This is, in many ways, anything but a trivial question. There will also be questions about whether such advanced AI systems will be misused to cause harm. Safety and trust are always a concern.
Finally, economic concerns are always among the first. Will AI just be an enhancer, or in some cases also a replacement for some human jobs? How will those people adapt to fast disruption?
There are many open questions and uncertainties about what ChatGPT means. Is it nothing special, or a big game changer?
Velebit.AI, as an AI-specialized agency, has been working on implementing AI in business across industries for years. We see the tectonic shift happening before our eyes as fascinating and disruptive. The idea of great alignment of human intent with AI systems has materialized, and this principle will expand even further, bringing both progress and controversies. More than ever, we want to share how important it is to understand the new technology to continue to deliver successful business solutions. All industries can benefit from tailored AI systems that align businesses with customer desires.
What do you think, how revolutionary will ChatGPT prove to be, and why? Let us know what interests you about applying ChatGPT or similar technologies, and don’t forget to follow us on LinkedIn for new updates!