Technology and Principles Behind ChatGPT: Invited Talk at FIDIT Rijeka

By Mladen Fernežir on May 2, 2023

Velebit AI’s Lead Data Scientist, Mladen Fernežir, gave a lecture at the Faculty of Informatics and Digital Technologies in Rijeka about how ChatGPT works and which technological developments lie ahead.

This talk is the culmination of our successful joint research on language models. Enes Deumić, Applied Machine Learning Researcher, followed with another talk, sharing tips and tricks about machine learning practice in the industry.

Faculty of Informatics and Digital Technologies Collaboration with Velebit AI

Velebit AI is happy to announce that we’ve signed a collaboration agreement with FIDIT, bringing us closer in the areas of research, development, and education. We’ve had the pleasure of working with FIDIT on a long-term natural language understanding project. You can check the project description for more details.

We developed subject-specific language models, followed by multiple supervised classification algorithms. The project built on the existing academic language models for the Croatian language, CroSloEngual BERT and BERTić, which we fine-tuned in a self-supervised way to develop a custom domain model. Then, we explored multiple supervised classification approaches on combinations of tabular data and transformer features. The work resulted in two scientific papers, and all code and models are open-sourced. It was great to work with FIDIT on an academic research project, and we are looking forward to more collaboration in the future!

Technology and Principles Behind ChatGPT and Similar Models

There’s been a lot of talk about ChatGPT in the last couple of months. I have talked about it before at the Google Developer Group meetup in Zagreb and on two podcasts, and I wrote two blog posts about it: “How Disruptive Is ChatGPT And Why?” and “Can ChatGPT Pass the Blade Runner’s Voight-Kampff Test?”.

In this Business Class invited talk for university students learning about AI, I again explained the key concepts that led to ChatGPT and similar models. I also talked about the most recent developments and some hot topics in both industry and academic research.

Here is the outline of the talk:

Introduction
ChatGPT Basics
Alignment Research
Challenges and Concerns
Language Models in Velebit AI
Current Outlook
Educational Resources

You can download the entire presentation, or you can always contact us for more details about implementing solutions that build on this technology for your needs.

I went over the basics of transformers and the GPT architecture, which was already great at producing the next probable word, taking into account all the previous context and prompts from the user. What came next was alignment to human intent and values, which gave birth to the Chat part and brought more value. As in earlier work with robots, where humans gave feedback on whether backflip A was better than backflip B, it is possible to modify the probability distribution of a model that produces the next probable word (GPT) so that it is better aligned with the current dialogue (ChatGPT).
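To make the idea of shifting a next-word distribution more concrete, here is a minimal sketch in plain Python. It is not how ChatGPT is trained: real RLHF updates the network weights using a learned reward model and a reinforcement learning algorithm. The sketch only shows the intended effect on a single next-word distribution, using the known closed-form solution of reward maximization with a KL penalty toward the base model, where the new probability is proportional to the base probability times exp(reward / beta). The candidate words, logits, reward values, and the `beta` parameter are all made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over candidate words."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def reweight(base_probs, rewards, beta=1.0):
    """Shift a base distribution toward higher-reward words.

    new_p(tok) is proportional to base_p(tok) * exp(reward(tok) / beta).
    A smaller beta lets the (hypothetical) human feedback dominate;
    a larger beta keeps the result closer to the base model.
    """
    scores = {tok: p * math.exp(rewards[tok] / beta) for tok, p in base_probs.items()}
    total = sum(scores.values())
    return {tok: s / total for tok, s in scores.items()}

# Hypothetical next-word candidates after a user asks for help.
base_logits = {"Sure,": 1.0, "No.": 1.2, "lol": 0.8}
# Hypothetical scores standing in for human preference feedback.
rewards = {"Sure,": 2.0, "No.": 0.0, "lol": -1.0}

base = softmax(base_logits)
aligned = reweight(base, rewards, beta=1.0)

for tok in base:
    print(f"{tok!r}: base={base[tok]:.3f} aligned={aligned[tok]:.3f}")
```

With these illustrative numbers, the base model slightly prefers "No.", while the reweighted distribution prefers the helpful continuation "Sure,".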

Going well beyond aligning text with desired intentions and values, research and development is rapidly moving toward aligning agents with any human behavior. This is the disruptive part of the current outlook, which has been unfolding over the last few weeks, with many startups and research companies offering new solutions.

Reinforcement learning from human feedback (RLHF) used to learn the backflip. The Chat part added RLHF techniques to the GPT part, which was already great at generating probable text continuations, but now with more alignment to the current dialogue.

Machine Learning in the Industry with Enes Deumić

I loved our trip to Rijeka for many reasons. My friend and ex-colleague Enes Deumić had a great invited talk about various specifics in applying machine learning in the industry. Enes is an experienced machine learning researcher and practitioner, and he had an engaging talk with the students.

It matters to understand the fundamentals well. It also matters to focus on many details related to data and model outputs that will go wrong if you don’t pay attention. I’ve seen it many times while working with Enes, and also in Velebit AI: it is the small details and diligence that are key to delivering value to various business stakeholders. Otherwise, it is very easy in data science and generally in AI development to come to the wrong conclusions.

Enjoying the city of Rijeka. From left to right: Enes Deumić, Sanda Martinčić-Ipšić, Ana Meštrović, Mladen Fernežir, Slobodan Beliga. Velebit AI's CEO Davor Aničić is behind the scenes, taking the photo.

How to Build AI Solutions and Products on Top of ChatGPT Technology

There’s been a lot of excitement in the last couple of weeks about developing various new solutions. New language models come out practically every other day. At the same time, there is also confusion, and even fear of the new technology. The unknown can always seem scary. That’s why it matters to understand the capabilities well, but also the limitations.

Research and development is currently moving toward adding more abilities to large language models and building more advanced agents that mimic any human behavior in the digital world. At the same time, the models don’t have the advanced internal processes that humans use while solving similar tasks, but they are becoming ever better at mimicking them. It is possible to exhibit behavior that looks intelligent, as if human, but without higher-level processes such as consciousness or thinking.

We’ve already seen advanced agents play games, such as StarCraft or Dota 2, picking their policies to reach the defined gaming rewards. Now, there is also the natural language interface that brings such agents closer to humans.

The Anatomy of Autonomy: Why Agents are the next AI Killer App after ChatGPT

We’ve already seen that adding multiple prompts and chains of reasoning can improve chat results. There are also multiple ways to add external memory, as I explained in my presentation. There is also a connection to the Internet for external knowledge, various external API calls as tools, and finally, orchestration frameworks such as AutoGPT that pursue advanced digital goals that otherwise humans would carry out.
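The pieces mentioned above (a model that decides the next step, tools it can call, and a memory of earlier results) can be sketched as a very small loop. This is a toy illustration under strong assumptions, not how AutoGPT or any specific framework is implemented. In particular, `scripted_model` is a hypothetical stand-in for a real language model call, and the `calculator` tool and `Memory` class are invented for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    """Hypothetical external memory: a list of notes from earlier steps."""
    notes: list = field(default_factory=list)

    def add(self, note):
        self.notes.append(note)

def calculator(expression):
    """Toy tool that only understands 'a + b' with integer operands."""
    left, right = expression.split("+")
    return str(int(left) + int(right))

# Registry of tools the agent is allowed to call (illustrative only).
TOOLS = {"calculator": calculator}

def scripted_model(goal, memory):
    """Stand-in for an LLM call; returns an (action, argument) pair.

    A real system would send the goal and the memory contents to a
    language model and parse its reply instead of following a script.
    """
    if not memory.notes:
        return ("calculator", "2 + 3")
    return ("finish", f"The answer is {memory.notes[-1]}")

def run_agent(goal, model, max_steps=5):
    """Ask the model for an action, run the tool, store the result, repeat."""
    memory = Memory()
    for _ in range(max_steps):
        action, argument = model(goal, memory)
        if action == "finish":
            return argument
        result = TOOLS[action](argument)
        memory.add(result)
    return "stopped: step limit reached"

answer = run_agent("What is 2 + 3?", scripted_model)
print(answer)
```

The step limit is a deliberate design choice: an orchestration loop that can call tools repeatedly needs a hard stop so that a confused model cannot run forever.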

At the same time, it is still the little details that matter, as Enes also put it in his lecture. With more ability and complexity than ever, it is the research and engineering for the last mile of value that matters the most. To have safe and reliable systems, all components and pipeline outputs must be closely monitored, with advanced control systems separate from the agent doing the advanced work.
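As a rough illustration of keeping the control system separate from the agent, the sketch below checks each step an agent proposes against simple rules before anything is executed, and logs every decision. All names here (`ALLOWED_ACTIONS`, `MAX_ARG_LENGTH`, `check_action`, `monitored_run`) and the rules themselves are assumptions made up for this example. A production system would need far richer checks and monitoring.

```python
# Hypothetical policy, defined outside of the agent itself.
ALLOWED_ACTIONS = {"search", "calculator", "finish"}
MAX_ARG_LENGTH = 200

def check_action(action, argument):
    """Independent check of one proposed step; returns (ok, reason)."""
    if action not in ALLOWED_ACTIONS:
        return False, f"action '{action}' is not on the allowlist"
    if len(argument) > MAX_ARG_LENGTH:
        return False, "argument is too long"
    return True, "ok"

def monitored_run(proposed_steps):
    """Review proposed steps in order and stop at the first violation.

    Every decision is logged, so pipeline outputs can be inspected later.
    """
    log = []
    for action, argument in proposed_steps:
        ok, reason = check_action(action, argument)
        log.append((action, ok, reason))
        if not ok:
            break  # halt before the unsafe step would be executed
    return log

# Hypothetical sequence of steps proposed by an agent.
log = monitored_run([
    ("search", "Rijeka weather"),
    ("delete_files", "/tmp"),
    ("finish", "done"),
])
for entry in log:
    print(entry)
```

Because the check lives outside the agent, the agent cannot talk its way around it: the rules apply regardless of what text the model generates.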

If you are interested in building such solutions, let’s chat!
