Text Summarization for NLP: 5 Best APIs, AI Models, and AI Summarizers in 2024

In Natural Language Processing (NLP), Text Summarization models automatically shorten documents, papers, podcasts, videos, and more into their most important soundbites. The models are powered by advanced Deep Learning and Machine Learning research.

Product teams are integrating Text Summarization APIs and AI Summarization models into their AI-powered platforms to create summarization tools that automatically summarize calls, interviews, law documents, and more. These are sometimes referred to as AI summarizers.

In this article, we’ll discuss what exactly Text Summarization is, how it works, a few of the best Text Summarization APIs, and some of the top use cases for summarization.

What is Text Summarization for NLP?

In Natural Language Processing, or NLP, Text Summarization refers to the process of using Deep Learning and Machine Learning models to synthesize large bodies of texts into their most important parts. Text Summarization can be applied to static, pre-existing texts, like research papers or news stories, or to audio or video streams, like a podcast or YouTube video, with the help of Speech-to-Text APIs.

Some Text Summarization APIs provide a single summary for a text, regardless of length, while others break the summary down into shorter, time-stamped sections.

Say, for example, you wanted to summarize the 2021 State of the Union Address, an hour-and-43-minute-long video.

Using a Text Summarization API with time stamps like Auto Chapters, you might be able to generate the following summaries for key sections of the video:

1:45: I have the high privilege and distinct honor to present to you the President of the United States.

31:42: 90% of Americans now live within 5 miles of a vaccination site.

44:28: The American job plan is going to create millions of good paying jobs.

47:59: No one working 40 hours a week should live below the poverty line.

48:22: American jobs finally be the biggest increase in non defense research and development.

49:21: The National Institute of Health, the NIH, should create a similar advanced research Projects agency for Health.

50:31: It would have a singular purpose to develop breakthroughs to prevent, detect and treat diseases like Alzheimer's, diabetes and cancer.

51:29: I wanted to lay out before the Congress my plan.

52:19: When this nation made twelve years of public education universal in the last century, it made us the best educated, best prepared nation in the world.

54:25: The American Family's Plan guarantees four additional years of public education for every person in America, starting as early as we can.

57:08: American Family's Plan will provide access to quality, affordable childcare.

61:58: I will not impose any tax increase on people making less than $400,000.

67:34: He said the U.S. will become an Arsenal for vaccines for other countries.

74:12: After 20 years of value, Valor and sacrifice, it's time to bring those troops home.

76:01: We have to come together to heal the soul of this nation.

80:02: Gun violence has become an epidemic in America.

84:23: If you believe we need to secure the border, pass it.

85:00: Congress needs to pass legislation this year to finally secure protection for dreamers.

87:02: If we want to restore the soul of America, we need to protect the right to vote.

Additionally, other summarization models can break long audio, video, or text inputs into more succinct summaries, such as the bullets, paragraph, and gist examples based on this podcast below:

Inside Intercom: Josh Seiden


Bullets:

  • Josh Seiden and Brian Donohue discuss the topic of outcome versus output on Inside Intercom. Josh Seiden is a product consultant and author who has just released a book called Outcomes Over Output. Brian is product management director and he's looking forward to the chat.
  • The main premise of the book is that by defining outcomes precisely, it's possible to apply this idea of outcomes in our work. It's in contrast to a really broad and undefined definition of the word ”outcome”.
  • Paul, one of the design managers at Intercom, was struggling to differentiate between customer outcomes and business impact. In Lean Startup, teams focus on what they can change in their behavior to make their customers more satisfied. They focus on the business impact instead of on the customer outcomes. They have a hypothesis and they test their hypothesis with an experiment. They don't have to be 100% certain, but they need to have a hunch. There is a difference between problem-focused and outcome-focused approaches to building and prioritizing projects. For example, a company is working on improving the inbox search feature in their product. They hope it will improve user retention and improve the business impact of the change.
  • Product teams need to focus on the outcome of their work rather than on the business impact of their product. They need to be more aware of the customer experience and their relationship with their business.
  • As a business owner, you have to build a theory of how the business works. The more you know about your business as your business goes on, the more you can build a business model. The business model is reflected in roadmaps and prioritizations.
  • Josh's book is available on Amazon, in print, in ebook and in audiobook on Brian's advice for teams looking to change their way of working is to start small and to use retrospectives and improve your process as you try to implement this. Josh and Brian enjoyed their conversation.


Paragraph:

Josh Seiden and Brian Donohue discuss the topic of outcome versus output on Inside Intercom. Josh Seiden is a product consultant and author who has just released a book called Outcomes Over Output. Brian is product management director and he's looking forward to the chat.


Headline:

Josh Seiden and Brian Donohue discuss the topic of outcomes versus output on Inside Intercom.


Gist:

Outcomes over output

How does Text Summarization Work?

Many text summarization methods have been developed over the last several decades, so there is no single answer to how text summarization works. That said, these methods can be classified according to their general approach to the summarization problem.

Perhaps the most clear-cut and helpful distinction is that between Extractive and Abstractive text summarization methods. Extractive methods seek to pull the most pertinent sentences and phrases directly from a text. Extractive text summarization is the more traditional of the two approaches, in part because of its relative simplicity compared to abstractive methods.

Abstractive methods instead seek to generate a novel body of text that accurately summarizes the original. We can already see why this is the more difficult problem: there is a significant degree of freedom in not being limited to returning a subset of the original text. This difficulty comes with an upside, though. Despite their relative complexity, Abstractive methods produce much more flexible and arguably more faithful summaries, especially in the age of Large Language Models.

Extractive Text Summarization methods

As mentioned above, Extractive Text Summarization methods work by identifying and extracting the salient information in a text. Different Extractive methods are therefore, at bottom, different ways of determining which information is important (and should therefore be extracted).

For example, frequency-based methods rank the sentences in a text by how frequently important words appear in them. Each word in the vocabulary carries a weight, usually a function of the importance of the word itself and the frequency with which it appears throughout the document as a whole. Using these weights, the importance of each sentence can be computed, and the top-ranked sentences returned as the summary.
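
To make this concrete, here is a minimal sketch of a frequency-based extractive summarizer in Python. Everything here (the stopword list, the tokenization, the scoring function) is deliberately naive and illustrative, not a production implementation:

```python
import re
from collections import Counter

# A toy stopword list; real systems use much larger ones.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "of", "to", "in", "and", "it", "this", "that"}

def frequency_summarize(text, num_sentences=2):
    """Score each sentence by the average document frequency of its content
    words, then return the top-scoring sentences in their original order."""
    # Naive sentence split on terminal punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freqs = Counter(words)

    def score(sentence):
        tokens = [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]
        return sum(freqs[w] for w in tokens) / max(len(tokens), 1)

    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    return [s for s in sentences if s in top]
```

Because sentences mentioning frequently repeated words score higher, a sentence like "Cats are great." outranks an unrelated aside in a text that is mostly about cats.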

Graph-based methods cast textual documents in the language of mathematical graphs. In this schema, each sentence is represented as a node, and nodes are connected if their sentences are deemed to be similar. What constitutes “similar” is, again, a choice made by specific algorithms and approaches. For example, one implementation might use a threshold on the cosine similarity between TF-IDF vectors. In general, the sentences that are globally the “most similar” to all other sentences in the document (i.e. those with the highest centrality) are considered to carry the most summarizing information, and are therefore extracted into the summary. A notable example of a graph-based method is TextRank, a version of Google’s PageRank algorithm (which determines which results to display in Google Search) adapted for summarization by ranking sentences instead of web pages. Graph-based methods may also benefit in the future from advances in Graph Neural Networks.
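
The graph-based pipeline can be sketched end to end with nothing but the standard library: build bag-of-words vectors, connect sentences by cosine similarity, and run PageRank-style power iteration over the resulting graph. This is a simplified illustration of the idea (real implementations typically use TF-IDF weighting and a proper convergence check), not the original TextRank algorithm verbatim:

```python
import math
import re
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[w] * b[w] for w in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def textrank_summarize(sentences, num_sentences=1, damping=0.85, iters=50):
    """Rank sentences by centrality in a similarity graph, PageRank-style."""
    bags = [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    n = len(sentences)
    # Edge weights: pairwise cosine similarity (no self-loops).
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    row_totals = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iters):  # power iteration
        scores = [
            (1 - damping) / n
            + damping * sum(
                scores[j] * sim[j][i] / row_totals[j]
                for j in range(n) if row_totals[j] > 0
            )
            for i in range(n)
        ]
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

Sentences that share vocabulary with many other sentences accumulate score through their edges, while an outlier sentence with no neighbors settles at the minimum baseline and is never selected.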

Abstractive Text Summarization methods

Abstractive methods seek to generate a novel summary that appropriately captures the information in a text. While there are linguistic approaches to Abstractive Text Summarization, Deep Learning (casting summarization as a seq2seq problem) has proven extremely powerful on this front over the past several years. The invention of the Transformer has therefore had a profound impact on Abstractive Text Summarization, as it has on so many other areas.
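
To see what "casting summarization as a seq2seq problem" means mechanically, here is the shape of the autoregressive decoding loop. The `model` argument is a stand-in for a trained encoder-decoder network, which is far beyond a short example; in a real Transformer it would return a distribution over the vocabulary, and greedy decoding would take its argmax:

```python
def greedy_decode(model, source_tokens, bos="<s>", eos="</s>", max_len=20):
    """Autoregressive generation: repeatedly feed the source and the
    summary-so-far to the model, appending one token at a time."""
    summary = [bos]
    for _ in range(max_len):
        next_token = model(source_tokens, summary)
        if next_token == eos:
            break
        summary.append(next_token)
    return summary[1:]  # drop the begin-of-sequence marker

def toy_model(source, prefix):
    """Stand-in for a trained seq2seq network: it just copies the first two
    source tokens, then emits end-of-sequence. A real model would instead
    score every vocabulary item conditioned on the source and the prefix."""
    i = len(prefix) - 1
    return source[i] if i < 2 else "</s>"
```

The key point the sketch illustrates is that the summary is generated token by token rather than selected from the source, which is precisely what gives abstractive methods their freedom.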

More recently, Large Language Models in particular have been applied to text summarization. The observation of Emergent Abilities in LLMs shows that LLMs become capable across a wide variety of tasks, summarization included. That is, while LLMs are not trained directly for the task of summarization, they develop into competent general Generative AI models as they scale, gaining the ability to perform summarization along with many other tasks.

Summarization-specific LLM approaches have also been explored, combining pre-trained LLMs with Reinforcement Learning from Human Feedback (RLHF), the core technique that evolved GPT into ChatGPT. This schema follows the canonical RLHF training approach, in which human feedback is used to train a reward model, which is then used to update an RL policy via PPO. In short, RLHF yields an improved, more easily prompted model that tailors its output to human expectations (in this case, human expectations of what a “good” summary is).
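
The reward-modeling step in this pipeline is commonly trained with a pairwise preference loss: given a summary that human raters preferred and one they rejected, the reward model is pushed to score the preferred one higher. A minimal sketch of that loss (the subsequent PPO policy update is omitted here):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).
    Minimizing it drives the reward model to score the human-preferred
    summary above the rejected one; when both rewards are equal, the
    loss sits at log(2)."""
    return -math.log(sigmoid(reward_chosen - reward_rejected))
```

A correctly ordered pair (chosen summary scored higher) produces a small loss, while a reversed pair produces a large one, which is exactly the gradient signal the reward model needs.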

Text summarization remains an active area of research, and there are natural extensions of the work that has already been done. For example, we might consider using Reinforcement Learning from AI Feedback (RLAIF) instead of RLHF, which has been shown in more general settings to yield comparable or improved performance.

Best APIs for Text Summarization

Now that we’ve discussed what Text Summarization for NLP is and how it works, we’ll compare some of the best Text Summarization APIs, AI summarizers, and AI Summarization models to utilize today. Note that some of these APIs support Text Summarization for pre-existing bodies of text, like a research paper, while others perform Text Summarization on top of audio or video stream transcriptions, like from a podcast or virtual meeting.

1. AssemblyAI’s Summarization Models

AssemblyAI is a Speech AI company building AI systems that can understand and process human speech. The company’s AI models for Summarization achieve state-of-the-art results on audio and video. AssemblyAI also offers Summarization models tuned for specific use cases, with informative, conversational, and catchy options. Summaries can be returned as bullets, gist, paragraph, or headline (see the example above).

LeMUR, AssemblyAI’s framework for Large Language Models, can also help product teams process requests for custom summary formats.

In addition, AssemblyAI offers a Summarization model called Auto Chapters, which applies Text Summarization on top of the data from an audio or video stream and supplies a time-stamped, one-paragraph summary and a single-sentence headline for each chapter. This capability is unique to AssemblyAI.
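
As a rough sketch of what calling the API looks like, the request below enables summarization when submitting an audio file for transcription. The parameter names (`summarization`, `summary_model`, `summary_type`) follow AssemblyAI’s documented REST API at the time of writing; check the current API reference before relying on them:

```python
import json
import urllib.request

def build_summary_request(audio_url):
    """Build the JSON body for a transcription request with summarization
    enabled. Parameter names follow AssemblyAI's documented REST API at
    the time of writing."""
    return {
        "audio_url": audio_url,
        "summarization": True,           # turn the Summarization model on
        "summary_model": "informative",  # or "conversational", "catchy"
        "summary_type": "bullets",       # or "gist", "paragraph", "headline"
    }

def submit_transcript(payload, api_key):
    """POST the request to the transcript endpoint (requires a real API key)."""
    req = urllib.request.Request(
        "https://api.assemblyai.com/v2/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The endpoint responds with a transcript ID that you poll until processing completes, at which point the summary fields are populated alongside the transcription text.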

AssemblyAI’s AI models are used by top product teams building podcast, telephony, virtual meeting, and conversational intelligence platforms, among others. The company also recently released Conformer-2, an AI model for automatic speech recognition trained on 1.1M hours of English audio data, which makes summaries generated from its transcriptions even more accurate and useful.

Here’s an example of AssemblyAI’s Summarization Model in action using this seven-minute YouTube video discussing Bias and Variance in Machine Learning.

AssemblyAI's Summarization Model Results:

Bias and Variance Explained

Bias and variants are two of the most important topics when it comes to 
data science. This video is brought to you by AssemblyAI and is part of 
our Deep Learning Explained series. AssemblyAI is a company that is 
making a state of the art speech to text API. You can grab a free API 
token using the link in the description. 

Models with High Bias

Bias is the amount of assumptions your model makes about the problem it 
is trying to solve. Underfitting is when a model is underfitting. 
Fitting variants show us the sensitivity of the model on the training 
data. High variance means overfitting models with high flexibility tend 
to have high variance like decision trees. 

Solutions for Model Overfitting

When a model is underfitting or overfitting, the first thing to do is 
to train it more or increase the complexity of the model. To deal with 
high variance you need to decrease the complexity or introduce more 
data to the training. Regularization on the other hand, reduces the 
complexity and lowers the variance of a model. 

Let’s See You Next Week

Thanks for watching the video. If you liked it, give us a like and 
subscribe. We would love to hear about your questions or comments in 
the comments section below.

2. plnia’s Text Summarization API

The plnia Text Summarization API generates summaries of static documents or other pre-existing bodies of text. In addition to Text Summarization, plnia also offers Sentiment Analysis, Keyword Extractor, Abusive Language Check, and more. Developers wishing to test plnia can sign up for a 10-day free trial; plans that include Text Summarization then start at $19 per month.

3. Microsoft Azure Text Summarization

As part of its Text Analytics suite, Azure’s Text Summarization API offers extractive summarization for articles, papers, or documents. Requirements to get started include an Azure subscription and the Visual Studio IDE. Pricing to use the API is pay-as-you-go, though prices vary depending on usage and other desired features.

4. MeaningCloud’s Automatic Summarization

MeaningCloud’s Automatic Summarization API lets users summarize the meaning of any document by extracting the most relevant sentences and using these to build a synopsis. The API is multilingual, so users can apply it regardless of the language the text is in. Those looking to test the API must first sign up for a free developer account; pricing then ranges from $0 to $999+/month, depending on usage.

5. NLP Cloud Summarization API

NLP Cloud offers several text understanding and NLP APIs, including Text Summarization, and supports fine-tuning and deploying community AI models to further boost accuracy. Developers can also build their own custom models and train and deploy them into production. Pricing ranges from $0 to $499/month, depending on usage.

Text Summarization Tutorials

Want to try Text Summarization yourself? This video tutorial walks you through how to apply Text Summarization to podcasts.

Building Products with Summarization AI

Text Summarization is used across a wide range of industries and applications.

Use cases include:

  • Creating chapters for YouTube videos or educational online courses via video editing platforms.
  • Summarizing and sharing key parts of corporate meetings to reduce the need for mass attendance.
  • Automatically identifying key parts of calls and flagging sections for follow-up via revenue intelligence platforms.
  • Summarizing large analytical documents to ease readability and understanding.
  • Segmenting podcasts and automatically providing a Table of Contents for listeners.

Additional Resources: