Large Language Models and
Where to Use Them: Part 1
Jul 7, 2022 • 8 min read
Over the past few years, large language models (LLMs) have evolved from emerging to mainstream technology. In this blog post, we'll explore some of the most common natural language processing (NLP) use cases that they can address. This is part one of a two-part series.
Large Language Models and Where to Use Them
You can find Part 2 here.
A large language model (LLM) is a type of machine learning model that can handle a wide range of natural language processing (NLP) use cases. But due to their versatility, LLMs can be a bit overwhelming for newcomers who are trying to understand when and where to use these models.
In this blog series, we’ll simplify LLMs by mapping out the seven broad categories of use cases where you can apply them, with examples from Cohere's LLM platform. Hopefully, this can serve as a starting point as you begin working with the Cohere API, or even seed some ideas for the next thing you want to build.
The seven use case categories are:
- Generate
- Summarize
- Rewrite
- Extract
- Search/Similarity
- Cluster
- Classify
Because of the general-purpose nature of LLMs, the range of use cases and relevant industries within each category is extremely wide. This post will not attempt to delve too deeply into each, but it will provide you with enough ideas and examples to help you start experimenting.
1. Generate
Probably the first thing that comes to mind when talking about LLMs is their ability to generate original and coherent text. And that’s what this use case category is all about. LLMs are pre-trained using a huge collection of text gathered from a variety of sources. This means that they are able to capture the patterns of how language is used and how humans write.
Getting the best out of these generation models is now becoming a whole field of study in and of itself called prompt engineering. In fact, the first four use case categories on our list all leverage prompt generation in their own ways.
More on the other three later, but the basic idea in prompt engineering is to provide a context for a model to work with. Prompt engineering is a vast topic, but at a very high level, the idea is to provide a model with a small amount of contextual information as a cue for generating a specific sequence of text.
One way to set up the context is to write a few lines of a passage for the model to continue. Imagine writing an essay or marketing copy where you would begin with the first few sentences about a topic, and then have the model complete the paragraph or even the whole piece.
Another way is by writing a few example patterns that indicate the type of text that we want the model to generate. This is an interesting one because of the different ways we can shape the models and the various applications that it entails.
Let’s take one example. The goal here is to have the model generate the first paragraph of a blog post. First, we prepare a short line of context about what we’d like the model to write. Then, we prepare two examples — each containing the blog’s title, its audience, the tone of voice, and the matching paragraph.
Finally, we feed the model with this prompt, together with the information for the new blog. And the model will duly generate the text that matches the context, as seen below.
You can test it out by accessing the saved preset.
In fact, the excerpt you read at the beginning of this blog was generated using this preset!
That was just one example, but how we prompt a model is limited only by our creativity. Here are some other examples:
- Writing product descriptions, given the product name and keywords
- Writing chatbot/conversational AI responses
- Developing a question-answering interface
- Writing emails, given the purpose/command
- Writing headlines and paragraphs
2. Summarize
The second use case category, which also leverages prompt engineering, is text summarization. Think about the amount of text that we deal with on a typical day, such as reports, articles, meeting notes, emails, transcripts, and so on. We can have an LLM summarize a piece of text by prompting it with a few examples of a full document and its summary.
The following is an example of article summarization, where we prepare the prompt to contain the full passage of an article and its one-line summary.
Prompt:
Completion:
You can test it out by accessing the saved preset.
Here are some other example documents where LLM summarization will be useful:
- Customer support chats
- Environmental, Social, and Governance (ESG) reports
- Earnings calls
- Paper abstracts
- Dialogues and transcripts
3. Rewrite
Another flavor of prompt engineering is text rewriting. This is another of those tasks that we do every day and spend a lot of time on, and if we could automate them, it would free us up to work on more creative tasks.
Rewriting text can mean different things and take different forms, but one common example is text correction. The following is the task of correcting the spelling and grammar in voice-to-text transcriptions. We prepare the prompt with a short bit of context about the task, followed by examples of incorrect and corrected transcriptions.
Prompt:
You can test it out by accessing the saved preset.
Here are some other example use cases for using an LLM to rewrite text:
- Paraphrase a piece of text in a different voice
- Build a spell checker that corrects text capitalizations
- Rephrase chatbot responses
- Redact personally identifiable information
- Turn a complex piece of text into a digestible form
4. Extract
Text extraction is another use case category that can leverage a generation LLM. The idea is to take a long piece of text and extract only the key information or words from the text.
The following is the task of extracting relevant information from contracts. We prepare the prompt with a short bit of context about the task, followed by a couple of example contracts and the extracted text.
Prompt:
Completion:
You can test it out by accessing the saved preset.
Some other use cases in this category include:
- Extract named entities from a document
- Extract keywords and keyphrases from articles
- Flag for personally identifiable information
- Extract supplier and contract terms
- Create tags for blogs
Conclusion
In part two of this series, we’ll continue our exploration of the remaining three use case categories (Search/Similarity, Cluster, and Classify). We’ll also explore how LLM APIs can help address more complex use cases. The world is complex, and a lot of problems can only be tackled by piecing multiple NLP models together. We’ll look at some examples of how we can quickly snap together a combination of API endpoints in order to build more complete solutions.
* * * * * * *
Large Language Models and
Where to Use Them: Part 2
Jul 7, 2022 • 8 min read
Over the past few years, large language models (LLMs) have evolved from emerging to mainstream technology. In this blog post, we'll explore some of the most common natural language processing (NLP) use cases that they can address. This is part one of a two-part series.
In Part 1 of our series, we covered the first four use case categories: Generate, Summarize, Rewrite, and Extractt. In this post, we will cover the other three: Search, Cluster, and Classify. Finally, we’ll look at how we can combine the different types, making their applications much more interesting and useful.
5. Search/Similarity
Any mention of LLMs will most likely spark discussion around their text generation capabilities, as we’ve seen in the previous four use cases. The less-talked-about, but equally powerful capability, is text representation.
While text generation is about creating new text, text representation is about making sense of existing text. Think about the amount of unstructured text data being generated today that’s only accelerated by the increasingly ubiquitous internet. It would not be possible for humans to process this massive volume of information without NLP-powered automation.
One such use case category for text representation is similarity search. Given a text query, the goal is to find documents that are most similar to the query.
The most obvious example use case for this is search engines. As users, we expect the search results to return links and documents that are highly relevant to our query. What makes modern search engines work very well is their ability to match the query to the appropriate results not just via keyword-matching, but by semantic similarity.
In simple words, they are able to perform matching based on meaning, context, themes, ideas — abstract concepts that may use different words altogether, but very much relate to each other.
Let’s say a user enters the search string “ground transportation at the airport.” The search engine must be able to know that the user is looking for taxis, car rentals, trains, or other similar services, even if the user doesn’t explicitly mention them.
When we input a piece of text into a representation model, instead of generating more text, the model generates a set of numbers that represent the meaning or context of the input text. These numbers are called “text embeddings”. In LLMs, they tend to be a very long sequence of numbers, typically in the thousands, and the longer they are, the more information is stored about the text.
With Cohere, you can access this type of model via the Embed endpoint. This Python notebook provides an example of a semantic search application, where given a question, the search engine would return other frequently asked questions (FAQ) whose text embeddings are the most similar to the question.
It goes on to show all the questions on a two-dimensional plot, shown in the image below, where the closer two points are on the plot, the more semantically similar they are.
This concept can be applied to a much broader range of use cases, for example:
- Retrieval of related and useful documents within an organization
- Similar product recommendations
- eCommerce product search
- Next article recommendations based on reading history
- Selecting chatbot responses from an available list
6. Cluster
Clustering is another use case category that leverages text embeddings. The idea is to take a group of documents and make sense of how they are organized and how they are related to each other.
Clustering is another use case category that leverages text embeddings. The idea is to take a group of documents and make sense of how they are organized and how they are related to each other.
In the previous use case, we visualized a set of documents on a plot to get a sense of how a set of documents are similar, or different, from each other. Clustering uses the same principles, but adds another step of organizing them into groups. This can be done via clustering algorithms, for example, k-means clustering, where we specify the number of clusters and the algorithm will return the appropriate cluster associated with each piece.
This Python notebook, also leveraging the Embed endpoint, goes into detail about how to make sense of three thousand “Ask HN” (Hacker News) posts. First, the text embeddings for each are generated. This is followed by clustering them into smaller groups by the theme or topic of the posts, supplemented by the keywords that represent the topic of each group.
Finally, these posts are visualized on a plot, shown in the image below, where one color represents a topic cluster. Below you can see a few topics emerging, such as life, career, coding, startups, and computer science.
This technique can be applied to number of different tasks, such as:
- Organizing customer feedback and requests into topics
- Segmenting products into categories based on product descriptions
- Turning ESG reports and news into themes
- Organizing a huge corpus of company documents
- Discovering emerging themes in survey responses analysis
7. Classify
Last but not least is the text classification category, and that’s because it is probably the most widely applicable use of NLP today. You can think of it as similar to clustering, with a slight twist.
Clustering is called an “unsupervised learning” algorithm. That’s because we don’t know what the clusters are beforehand — we assign a number of clusters (we can choose any number), and the algorithm will group the documents we give according to that number.
On the other hand, classification is a “supervised learning” algorithm, because this time, we already know beforehand what those clusters, or more precisely classes, are.
For example, say we have a list of eCommerce customer inquiries, and for routing purposes, we would like to categorize each of them into one of three classes: Shipping, Returns, and Tracking. To make the classifier work, we first need to train it by showing it enough examples of a piece of text, such as “Do you offer same day shipping?”, and its actual class, which in this case is Shipping.
With LLMs, there are a couple of possible approaches to doing this. The first is via text embeddings, demonstrated in this Python notebook. It shows an example of training a classifier using text embeddings. First, it generates the embeddings of each piece of text. Next, it uses these embeddings as the input for training the classifier. For this kind of setup, the number of training examples required will depend on the task, but typically it can range in the hundreds or even thousands.
The other approach is by leveraging “few-shot” classification. With this approach, we are leveraging prompt engineering to provide classification examples to the model. This has shown to work well with as few as five training examples per class, though it still depends on the kind of task we are working on. But this option allows us to build a working classifier when we don’t have many training examples — an all-too-common problem.
Here’s how we would build the eCommerce inquiries classifier with a few-shot approach. The following is a screenshot from the Cohere Playground, where we leverage the Classify endpoint to build a classifier.
First, we prepare the prompt containing examples of text-class pairs. With a minimum of five examples per class, and three classes, we give it a total number of fifteen examples.
Next, we add any number of inputs that we would like to classify — here we have two inputs as examples.
The list of inputs for the classifier to classify |
We can then trigger the classification, in which the model will output the predicted class for each input and the accompanying confidence level values, which indicate how confident the model is in its prediction of each class.
The predictions given by the classifier together with the confidence levels |
You can test it out by accessing the saved preset.
Some example areas where text classification can be useful include:
- Content moderation for toxic comments on online platforms
- Intent classification in chatbots
- Sentiment analysis on social media activity
- eCommerce product categorization
- Assigning customer support tickets to the right teams
Getting the best out of the Cohere API
Now that we’ve covered the seven main use case categories for LLMs, let’s consider how we can build really interesting applications — by stacking these different capabilities together. Let’s look at a few examples and start with a fun one.
Imagine that you are creating a chatbot that needs to have a certain voice or style. In our case, that bot happens to a pirate!
Let’s make it a game where people can enter a phrase, and the bot will decide whether the phrase is “pirate” enough. And if it’s not, the bot will even correct the phrase and turn it into pirate lingo!
This is actually something that our team has experimented with ourselves. But without going into the implementation details, to make it work, we had to first classify whether or not a phrase is acceptable pirate speak. If it’s not, then we put the phrase through a pirate paraphraser. We then compare the similarity between the generated phrase and the original phrase, and only if they are similar enough would the bot return the new phrase.
To make this happen, we made use of three use case categories: Classify, Rewrite, and Search/Similarity.
A summary flow of the pirate paraphraser |
A more serious example would be a chatbot that answers questions on a forum. Here’s one possible basic implementation. First, we implement a classification step to determine if a user has entered a question or just a general comment or chat. And if it’s a question, then we proceed to search for the question from our database that is the most similar to the query, so we can provide a relevant answer.
A summary flow of the question answering chatbot |
In another example, let’s say we are building an article recommendation system, where the goal is to provide a list of other articles most relevant to the one that a user is currently reading. This article demonstrates an example of implementing similarity search, classification, and extraction in a basic recommender.
A summary flow of the article recommender |
We can take it even further by combining these steps with other APIs. In a recent blog post, we describe how to build complete, fully playable Magic the Gathering cards with AI, combining the capabilities of Cohere’s API together with a text-to-image generation API.
A summary flow of the Magic the Gathering card generator |
Conclusion
With these examples, we are only just scratching the surface. The possibilities of using LLMs are limited only by our imagination. This is an exciting time where any developer and team, not just the big players anymore, can tackle some of the toughest NLP challenges by leveraging cutting-edge AI technologies that are made available via simple API calls.