In January 2024, I had the opportunity to demystify the buzz around Large Language Models (LLMs) for my fellow software developers through a presentation titled “LLMs without the hype: How to leverage LLMs as a software developer.” My journey in the software development industry spans over 15 years, and in recent months I’ve been deeply involved in building applications powered by LLMs. My goal was to share my experiences and learnings, moving past the initial excitement to offer a grounded perspective on how these powerful tools can be practically applied in our daily work.
If you are mainly looking for the slides, click here.
The Essence of LLMs for Developers
The presentation kicked off by setting the right expectations about LLMs. Far from being magical solutions, I aimed to show that LLMs are sophisticated tools that, when leveraged correctly, can significantly amplify our capabilities as developers. The core objectives were to introduce the basics of LLMs, including their architecture, the concept of embeddings and vector stores, the innovative approach of Retrieval-Augmented Generation (RAG), and some practical considerations for maintenance. My intention was to give developers a solid foundation for building their own LLM-backed applications.
A Practical Journey: Building InboxAgent
To anchor our exploration in a tangible project, I shared the development journey of InboxAgent, an application designed to help users achieve “inbox zero” with the aid of AI. I walked through how InboxAgent works, from checking emails daily to generating comprehensive reports that highlight urgent actions, provide summaries of important content, and give a general overview of the inbox. This example served as a practical illustration of applying LLMs to solve real-world problems.
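The daily-report flow described above can be sketched in a few lines. Everything here is illustrative: the `summarize` helper is a stand-in for an LLM call, and a real version of the agent would fetch mail over a protocol like IMAP rather than take a hardcoded list.

```python
def summarize(body: str) -> str:
    """Placeholder for an LLM summarization call; here we just truncate."""
    return body[:60] + ("..." if len(body) > 60 else "")

def build_report(emails: list[dict]) -> dict:
    """Split a day's emails into urgent actions, summaries, and an overview."""
    urgent = [e for e in emails if e["urgent"]]
    return {
        "urgent_actions": [f"{e['sender']}: {summarize(e['body'])}" for e in urgent],
        "summaries": [summarize(e["body"]) for e in emails],
        "overview": f"{len(emails)} emails today, {len(urgent)} flagged urgent.",
    }

emails = [
    {"sender": "boss@example.com",
     "body": "Please send the Q3 numbers by Friday.", "urgent": True},
    {"sender": "newsletter@example.com",
     "body": "This week in tech: headlines and links.", "urgent": False},
]
report = build_report(emails)
print(report["overview"])  # 2 emails today, 1 flagged urgent.
```

The useful structural point is that the LLM sits behind one narrow function, so the rest of the pipeline stays ordinary, testable code.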
Deep Dive into LLMs
A substantial part of my talk was dedicated to elucidating the inner workings of LLMs. Starting with the transformer model, introduced in 2017, I explained how these models are trained on extensive text data using self-supervised learning techniques. I addressed the technical challenges of running LLMs locally, such as model size and hardware requirements, and offered practical advice on starting with quantized models and CPU-only setups to mitigate these challenges.
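To make the quantization advice concrete, here is a toy round-trip through symmetric int8 quantization: weights are stored as 8-bit integers plus a single float scale instead of 32-bit floats, cutting storage to roughly a quarter. Real schemes used for local inference (for example the 4-bit GGUF quantizations) are more elaborate, but the core idea is the same.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats into [-127, 127] using one symmetric scale factor."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.04]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored weight differs from the original by at most half a scale step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)  # [30, -127, 84, 10]
```

The trade-off the talk described follows directly: a small, bounded loss of precision in exchange for a model that fits in far less memory and runs on CPU-only hardware.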
Mastering Prompt Engineering
A key takeaway from my presentation was the critical role of prompt engineering in maximizing the effectiveness of LLMs. I discussed how crafting precise prompts is essential for eliciting accurate and relevant responses from LLMs, exploring techniques like prompt chaining, zero-shot, and few-shot prompting. I recommended resources for attendees to delve deeper into the art and science of prompt engineering.
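Few-shot prompting, one of the techniques mentioned above, is easy to show in code: the prompt demonstrates the task a couple of times before presenting the real input. The task and examples below are made up for illustration.

```python
def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Build a prompt that shows worked examples before the actual query."""
    lines = [task, ""]
    for text, label in examples:
        lines.append(f"Input: {text}\nOutput: {label}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n".join(lines)

prompt = few_shot_prompt(
    task="Classify the sentiment of each input as positive or negative.",
    examples=[
        ("I love this library!", "positive"),
        ("The build keeps failing.", "negative"),
    ],
    query="The docs are excellent.",
)
print(prompt)
```

Ending the prompt with a dangling `Output:` nudges the model to complete the pattern, which is the essence of the few-shot technique; zero-shot prompting is the same template with an empty examples list.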
Enhancing LLMs with Retrieval-Augmented Generation
I introduced the audience to Retrieval-Augmented Generation (RAG), a technique that enriches LLM prompts with data retrieved from external sources, thus enhancing the model’s responses. By integrating semantic search with RAG, we can significantly boost the performance and relevancy of LLM outputs in our applications.
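The retrieval step of RAG can be sketched with a deliberately crude embedding: a bag-of-words vector and cosine similarity. A production system would use learned embeddings and a vector database, but the shape of the pipeline, embed, retrieve, then splice the hit into the prompt, is the same.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

docs = [
    "InboxAgent checks your email every morning.",
    "The report highlights urgent actions and summaries.",
    "Quantized models run on CPU-only machines.",
]
question = "what does the report highlight"

# Retrieve the most relevant document, then augment the prompt with it.
best = max(docs, key=lambda d: cosine(embed(question), embed(d)))
prompt = (f"Context: {best}\n\n"
          f"Answer using only the context above.\nQuestion: {question}")
```

Swapping `embed` for a real embedding model and `docs` for a vector-store lookup turns this sketch into the semantic-search-plus-RAG setup discussed in the talk.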
Tools, Evaluation, and Keeping LLMs in Check
The presentation also covered the landscape of tools available for LLM orchestration and vector databases, providing a useful starting point for developers interested in exploring these technologies further. I emphasized the importance of beginning with an evaluation dataset and shared insights on evaluating and maintaining LLMs in production, focusing on the necessity of guardrails and continuous monitoring.
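The evaluation loop advocated above can be as simple as this sketch: run the model over a small labeled dataset and pass every answer through a guardrail before scoring it. Here `fake_model` is a stand-in for a real LLM call, and the guardrail is a deliberately basic length check; real guardrails would also screen for format, toxicity, or leaked data.

```python
def fake_model(question: str) -> str:
    """Stand-in for an LLM call, with canned answers for the demo."""
    canned = {"capital of France?": "Paris", "2 + 2?": "4"}
    return canned.get(question, "I don't know")

def guardrail(answer: str) -> bool:
    """Reject empty or suspiciously long answers before they reach users."""
    return 0 < len(answer) <= 200

eval_set = [
    ("capital of France?", "Paris"),
    ("2 + 2?", "4"),
    ("airspeed of a swallow?", "11 m/s"),
]

passed = sum(
    1 for q, expected in eval_set
    if guardrail(ans := fake_model(q)) and ans == expected
)
print(f"accuracy: {passed}/{len(eval_set)}")  # accuracy: 2/3
```

Because the eval set is just data, it can run in CI on every prompt or model change, which is what makes continuous monitoring of an LLM-backed app practical.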
In concluding my talk, I revisited the key themes, emphasizing the actionable insights into training, running LLMs locally, prompt engineering, and the critical aspects of evaluation. My aim was not just to share knowledge but to encourage a deeper exploration of LLMs.
My presentation at the conference was an endeavor to cut through the hype surrounding LLMs, providing a practical roadmap for software developers to harness the power of these models in enhancing their applications and workflows.
For those keen on exploring the concepts discussed in greater depth, the full slide deck is available here.