#1: Building AI products 101 - LLMs
Here's a summary of what I've learned while exploring AI tools to see what's possible and what kind of products a single person could build and possibly monetize!
Welcome to the #1 issue of this newsletter 👋
What are large language models (LLMs)?
There is a concept of “next-word prediction” that you likely use if you have predictive text enabled on your iPhone. Large language models (LLMs) are basically highly sophisticated models that build on this concept and are trained on internet data (think blogs, forums, StackOverflow, and more).
In this issue I’ve focused only on text-to-text models; however, there are many others, e.g. text-to-image, text-to-video, text-to-speech and speech-to-text, to name a few.
The two most popular LLMs are GPT-3.5/4 by OpenAI and Llama by Meta (Llama is open-source). You can interact with GPT using the ChatGPT UI or the OpenAI API. For Llama you can use Hugging Face, which is like GitHub, but for AI models – see the Llama demo or use the model hosted there from code. You can also download the Llama 2 model and run it offline (sizes vary from 4 to 129 GB), e.g. by using llama.cpp and just 7 lines of code.
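To give a feel for the API route, here's a minimal TypeScript sketch of asking GPT-3.5 a question through the OpenAI Node SDK (v4-style client). The model name, the `ask` helper and the `OPENAI_API_KEY` environment variable are my assumptions, so adjust to your setup:

```ts
// Minimal sketch: asking GPT-3.5 a question via the OpenAI Node SDK (v4).
// Assumes OPENAI_API_KEY is set in the environment.
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function ask(question: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: question }],
  });
  return completion.choices[0].message.content ?? "";
}

console.log(await ask("Explain next-word prediction in one sentence."));
```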
Embeddings
Now, the fun part! While LLMs are highly generalized, what if you want to give them specific knowledge, e.g. about your organisation? You won't fit the whole knowledge base of your company into a prompt. This is exactly the limitation that embeddings help overcome.
Creating an embedding means converting a piece of text into a vector (an array of numbers) that captures the semantic meaning of that text. This enables similarity search, recommendation, classification, etc.
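Here's a minimal sketch of the idea in TypeScript, assuming the OpenAI Node SDK (v4) and the text-embedding-ada-002 model; the helper names are my own:

```ts
// Minimal sketch: turning text into an embedding and comparing two texts.
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function embed(text: string): Promise<number[]> {
  const res = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  return res.data[0].embedding; // a vector of ~1500 numbers
}

// Cosine similarity: close to 1 = similar meaning, close to 0 = unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const normA = Math.sqrt(a.reduce((s, x) => s + x * x, 0));
  const normB = Math.sqrt(b.reduce((s, x) => s + x * x, 0));
  return dot / (normA * normB);
}

const [a, b] = await Promise.all([
  embed("How do I request vacation days?"),
  embed("What is the process for taking time off?"),
]);
console.log(cosineSimilarity(a, b)); // high score: the sentences mean similar things
```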
Example: how does "chat with PDF" work under the hood? (There's a code sketch after the steps below.)
1. Create embeddings from the PDF by first dividing it into chunks. OpenAI's text-embedding-ada-002 embedding model accepts at most 8,191 input tokens (roughly 6,000 words) per request, so longer documents have to be split. Check out the OpenAI Tokenizer to visualize how word count translates to tokens.
2. (Optional) For very large PDFs, save the embeddings in a vector database like Pinecone or Postgres with the pgvector extension.
3. When the user asks a question, create an embedding from it as well. Now you can easily perform a semantic search to find the parts of the PDF that are most relevant (similar) to what the user asked.
4. Create a prompt for the LLM (e.g. GPT-3.5) that consists of the context (the search results from step 3) and the question the user asked.
5. Done!
6. (Optional) Add memory to the conversation, so that the chat remembers past messages. This simply means including the “past messages” as context with each subsequent question the user asks.
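To make the steps concrete, here's a rough TypeScript sketch of the core loop (steps 1, 3 and 4), reusing the embed() and cosineSimilarity() helpers from the embeddings sketch above. PDF parsing, chunking and the optional vector database are left out, and all names here are my own:

```ts
// Rough sketch of steps 1, 3 and 4. Assumes the PDF has already been parsed
// and split into text chunks elsewhere (e.g. with a PDF parsing library).

type Chunk = { text: string; vector: number[] };

// Step 1: embed every chunk of the PDF once, up front.
async function indexPdf(chunks: string[]): Promise<Chunk[]> {
  return Promise.all(
    chunks.map(async (text) => ({ text, vector: await embed(text) }))
  );
}

// Step 3: embed the question and pick the most similar chunks.
async function findRelevantChunks(
  question: string,
  index: Chunk[],
  topK = 3
): Promise<string[]> {
  const qVector = await embed(question);
  return index
    .map((c) => ({ text: c.text, score: cosineSimilarity(qVector, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((c) => c.text);
}

// Step 4: stuff the relevant chunks into the prompt as context.
async function answer(question: string, index: Chunk[]): Promise<string> {
  const context = (await findRelevantChunks(question, index)).join("\n---\n");
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "Answer using only the provided context." },
      { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```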
Damon Chen grew his “chat with PDF” app, PDF.ai, to $8,333 MRR in 2 months – read more…
Chaining
In the "chat with PDF" example, there is a sequence of instructions like preparing the data or interacting with the model. Majority of AI product have it and they are called chains. For this reason LangChain was created, it’s a framework that helps you create and play with chains easily by using components which will keep your app modular. For example your could quickly replace one model or component with the other.
Example LangChain use cases: QA over Documents, Code Understanding, Summarization and many more.
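As a small illustration, here's what a basic chain looks like in LangChain.js. This is a sketch based on the early (0.0.x) JavaScript API, so import paths and class names may differ in newer LangChain versions:

```ts
// A tiny LangChain.js chain: a prompt template plus a model, combined into
// one reusable component. Assumes OPENAI_API_KEY is set in the environment.
import { OpenAI } from "langchain/llms/openai";
import { PromptTemplate } from "langchain/prompts";
import { LLMChain } from "langchain/chains";

const model = new OpenAI({ temperature: 0 });

const prompt = PromptTemplate.fromTemplate(
  "Summarize the following text in one paragraph:\n\n{text}"
);

// Swap the model or the prompt without touching the rest of the app.
const summarizeChain = new LLMChain({ llm: model, prompt });

const result = await summarizeChain.call({
  text: "LLMs are next-word predictors trained on internet-scale data...",
});
console.log(result.text);
```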
Product ideas
I remember Sam Altman saying in this interview that the biggest business opportunities and value (and yes, money) lie in the “middle layer”, which he describes as a base model (e.g. GPT-3.5) tuned to a specific industry (e.g. medicine). So companies can create their own private chats with specific knowledge built in, then possibly share them with others and charge for it.
Another point to make is that many AI products don’t have a moat. That’s because most of them rely on publicly available models instead of training their own. In other words, they are easy to copy. So the competitive advantage often comes down to who is first to build the product, or who has better distribution channels (existing customers, followers, etc.).
Idea 1: Company knowledge-base chat
In many organisations knowledge is scattered around, so being able to chat with all sources (think Notion, Confluence, Slack, Google Calendar) would be great. For example, a user may ask:
“What is the process to do X”, and the answer could be:
“Here is the Notion link explaining the process, but recently Greg and Sam discussed it on #process-x on Slack and they have a meeting about it scheduled for tomorrow, reach out to them for more info”.
Another question: “Summarize what happened in team XYZ during my time off“.
Allow hosting it on-premises, so data never leaves the company’s servers.
Idea 2: Browser extension: Code explainer
Go to GitHub, select a piece of code and let the extension explain it in plain English. Allow asking additional questions, e.g. about a specific line of code. Additionally, it could translate the code to a different programming language. Great for learning!
Idea 3: Wonderful web browser
A browser that opens any website, especially blogs or news, and shows it in a good-looking, reader-like layout, without ads and with a summary. 🤤
Idea 4: Monitor your brand
Let’s say you own a well-known company or you are a well-known person. This app would search Google and various social media platforms every day for mentions of the brand, then show the frequency and sentiment of those mentions.
Code implementation
Python is the go-to language for anything related to AI, machine learning and data science. But I don’t know Python 🙈 So I’m happy that JavaScript (and TypeScript) is supported by the OpenAI API, the Hugging Face API and LangChain. This is awesome because it’s now possible to build the whole app (backend, frontend and the website) with JS/TS only, yay!
That’s all!
Would you like to read about other AI models e.g. for image generation next time?
Share your thoughts and feedback in the comments.
See you next time! 👋
- Jarek