Technology and Principles Behind ChatGPT: Invited Talk at FIDIT Rijeka
By Mladen Fernežir on May 2, 2023
This talk is the culmination of our previous successful joint research work regarding language models. Enes Deumić, Applied Machine Learning Researcher, followed with another talk, sharing multiple tips and tricks about machine learning practices in the industry.
Faculty of Informatics and Digital Technologies Collaboration with Velebit AI
Velebit AI is happy to announce that we’ve signed a collaboration agreement with FIDIT, bringing us closer in the areas of research, development, and education. We’ve had the pleasure of working with FIDIT on a long-term natural language understanding project. You can check the project description for more detail.
We developed subject-specific language models, followed by multiple supervised classification algorithms. The project built on the existing academic language models for the Croatian language, CroSloEngual BERT and BERTić, which we fine-tuned in a self-supervised way to develop a custom domain model. Then, we explored multiple supervised classification approaches on combinations of tabular data and transformer features. The work resulted in two scientific papers, and all code and models are open-sourced. It was great to work with FIDIT on an academic research project, and we are looking forward to more collaboration in the future!
Technology and Principles Behind ChatGPT and Similar Models
There’s been a lot of talk about ChatGPT in the last couple of months. I have talked about it before at the Google Developer Group meetup in Zagreb and on two podcasts, and I wrote two blog posts about it: “How Disruptive Is ChatGPT And Why?” and “Can ChatGPT Pass the Blade Runner’s Voight-Kampff Test?”.
In this Business Class invited talk for university students learning about AI, I again explained the key concepts that led to ChatGPT and similar models. I also talked about the most recent developments and some hot topics in both industry and academic research.
Here is the outline of the talk:
- ChatGPT Basics
- Alignment Research
- Challenges and Concerns
- Language Models in Velebit AI
- Current Outlook
- Educational Resources
You can download the entire presentation, or you can always contact us for more details about implementing solutions that build on this technology for your needs.
I went over the basics of transformers and the GPT architecture, which was already good at producing the next probable word given all the previous context and the user’s prompts. What came next was alignment with human intent and values, which gave birth to the Chat part and brought more value. The same idea was used earlier for robots, where humans give feedback on whether somersault A was better than somersault B. In the same way, it is possible to modify the probability distribution of a model that produces the next probable word (GPT) so that it is better aligned with what people want from the current dialog (ChatGPT).
Beyond aligning text with desired intentions and values, research and development is rapidly moving toward aligning agents with any human behavior. This is the disruptive part of the current outlook, which has taken shape in the last few weeks, with many startups and research companies offering new solutions.
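To make the idea of shifting a next-word distribution with human feedback more concrete, here is a minimal Python sketch. It is a toy illustration and not how ChatGPT is actually trained: the candidate words, logits, reward values, and the function names `preference_probability` and `reweight` are all made up for this example. It assumes a Bradley-Terry style model for pairwise human preferences and a simple exponential re-weighting of the base probabilities by reward.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over candidate words."""
    top = max(logits.values())
    exps = {word: math.exp(score - top) for word, score in logits.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

def preference_probability(reward_a, reward_b):
    """Bradley-Terry style model: chance that a human prefers A over B."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))

def reweight(base_probs, rewards, beta=1.0):
    """Tilt the base distribution toward higher-reward words.

    Each probability is multiplied by exp(reward / beta) and the result
    is renormalized. A larger beta keeps the result closer to the base model.
    """
    scores = {
        word: p * math.exp(rewards.get(word, 0.0) / beta)
        for word, p in base_probs.items()
    }
    total = sum(scores.values())
    return {word: s / total for word, s in scores.items()}

# Hypothetical next-word candidates for a dialog reply (invented values).
base_logits = {"Sure,": 1.0, "No.": 1.2, "Whatever.": 0.8}
# Hypothetical rewards, standing in for a model learned from human comparisons.
rewards = {"Sure,": 2.0, "No.": 0.0, "Whatever.": -1.0}

base = softmax(base_logits)
aligned = reweight(base, rewards)

if __name__ == "__main__":
    for word in base:
        print(f"{word:10s} base={base[word]:.3f} aligned={aligned[word]:.3f}")
```

In this sketch the base model slightly prefers "No.", while the re-weighted distribution prefers the reply that the hypothetical human feedback scored higher. Real systems learn the reward from many human comparisons and update the model weights with reinforcement learning rather than re-weighting a fixed table.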
Machine Learning in the Industry with Enes Deumić
I loved our trip to Rijeka for many reasons. My friend and ex-colleague Enes Deumić had a great invited talk about various specifics in applying machine learning in the industry. Enes is an experienced machine learning researcher and practitioner, and he had an engaging talk with the students.
It matters to understand the fundamentals well. It also matters to focus on many details related to data and model outputs that will go wrong if you don’t pay attention. I’ve seen it many times while working with Enes, and also in Velebit AI: it is the small details and diligence that are key to delivering value to various business stakeholders. Otherwise, it is very easy in data science and generally in AI development to come to the wrong conclusions.
How to Build AI Solutions and Products on Top of ChatGPT Technology
There’s been a lot of excitement in the last couple of weeks about developing various new solutions. New language models come out practically every other day. At the same time, there is also confusion, and even fear of the new technology. The unknown can always seem scary. That’s why it matters to understand the capabilities well, but also the limitations.
Research and development is currently moving toward adding more abilities to large language models and building more advanced agents that mimic any digital human behavior. The models don’t have the advanced internal processes that humans rely on while solving similar tasks, but they are becoming ever better at mimicking them. It is possible to exhibit behavior that looks intelligent, as if human, without higher-level processes such as consciousness or thinking.
We’ve already seen advanced agents play games such as StarCraft or Dota 2, choosing their policies to reach the defined in-game rewards. Now, there is also the natural language interface to bring such agents closer to humans.
We’ve already seen that adding multiple prompts and chains of reasoning can improve chat results. There are also multiple ways to add external memory, as I’ve explained in my presentation. On top of that come a connection to the Internet for external knowledge, external API calls used as tools, and finally orchestration frameworks such as AutoGPT that pursue complex digital goals which otherwise only humans could carry out.
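The building blocks mentioned above, external memory and tool calls, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the design of AutoGPT or any real framework: `KeywordMemory`, `TOOLS`, `run_step`, and the `name:argument` action format are hypothetical names invented here, no language model is called, and memory lookup uses plain word overlap where real systems typically use embeddings.

```python
from typing import Callable, Dict, List

class KeywordMemory:
    """Toy external memory: stores notes and retrieves them by word overlap."""

    def __init__(self) -> None:
        self.notes: List[str] = []

    def add(self, note: str) -> None:
        self.notes.append(note)

    def search(self, query: str, k: int = 1) -> List[str]:
        query_words = set(query.lower().split())
        ranked = sorted(
            self.notes,
            key=lambda note: len(query_words & set(note.lower().split())),
            reverse=True,
        )
        return ranked[:k]

# Hypothetical tool registry: each tool maps a text argument to a text result.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(sum(float(x) for x in expr.split("+"))),
    "shout": lambda text: text.upper(),
}

def run_step(action: str, memory: KeywordMemory) -> str:
    """Execute one 'name:argument' action, as an orchestrator might."""
    name, _, argument = action.partition(":")
    if name == "recall":
        return "; ".join(memory.search(argument))
    if name in TOOLS:
        return TOOLS[name](argument)
    return f"unknown tool: {name}"

memory = KeywordMemory()
memory.add("the talk was held in Rijeka")
memory.add("Enes spoke about machine learning in the industry")

if __name__ == "__main__":
    # In a real agent, a language model would propose these actions.
    for action in ["recall:where was the talk held", "calculator:2+3"]:
        print(action, "->", run_step(action, memory))
```

In a full agent, the language model would decide which action to take next and read the returned text as new context. The sketch only shows the plumbing around that decision.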
At the same time, it is still the little details that matter, as Enes also put it in his lecture. With more ability and complexity than ever, it is the research and engineering for the last mile of value that matters the most. To have safe and reliable systems, all components and pipeline outputs must be closely monitored, with advanced control systems separate from the agent doing the advanced work.
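As a rough illustration of keeping the control system separate from the agent, the Python sketch below wraps an arbitrary agent step in independent output checks. The check names, the 200-character limit, the blocked string, and the `guarded` helper are assumptions made up for this example. Production monitoring would cover far more, such as logging, alerting, and human review.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class Check:
    """A named rule that an agent output must satisfy."""
    name: str
    passes: Callable[[str], bool]

# Hypothetical rules, defined independently of the agent being monitored.
CHECKS: List[Check] = [
    Check("non_empty", lambda text: bool(text.strip())),
    Check("max_length", lambda text: len(text) <= 200),
    Check("no_shell_delete", lambda text: "rm -rf" not in text),
]

def monitor(output: str, checks: List[Check]) -> List[str]:
    """Return the names of all checks that the output fails."""
    return [check.name for check in checks if not check.passes(output)]

def guarded(
    agent_step: Callable[[str], str],
    prompt: str,
    checks: List[Check] = CHECKS,
) -> Tuple[Optional[str], List[str]]:
    """Run one agent step and release its output only if every check passes."""
    output = agent_step(prompt)
    failures = monitor(output, checks)
    if failures:
        return None, failures
    return output, []

if __name__ == "__main__":
    # Stand-in agents: any callable from prompt text to output text works here.
    print(guarded(lambda prompt: "Here is a summary.", "summarize"))
    print(guarded(lambda prompt: "run rm -rf / now", "clean up"))
```

Because the checks never call into the agent, they keep working the same way no matter how the agent itself changes.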
If you are interested in building such solutions, let’s chat!