Introduction to Generative AI

Generative AI has made tremendous strides recently, from models like Stable Diffusion to ChatGPT. Get up to speed on the latest advancements with this easy-to-follow introduction to Generative AI.

The term “Generative AI” has appeared as if out of thin air over the past few months. Looking at Google Trends, we can see sharp growth in interest even within just the past 12 months.

This interest can be attributed to the release of Generative models like DALL-E 2, Imagen, and ChatGPT. But what does “Generative AI” actually mean?

In this article, part of our Everything you need to know about Generative AI series, we will provide an overview of the topic from the ground up. Explanations will span all experience levels so that all readers can better understand how this technology works as it becomes more integrated into our daily lives.

What is Generative AI?

In short, Generative AI refers to any Artificial Intelligence model that generates novel data, information, or documents.

For example, many businesses record their meetings - both in person and virtual. Here are a few ways Generative AI can extract value from these recordings:

  1. It can automatically generate a list of action items to ensure that meetings have actionable value.
  2. It can generate a summary of the meeting for people who couldn’t make it to the meeting, distilling the important information down to improve efficiency.
  3. It can generate context-relevant answers to questions that come up during the meeting.

Generative AI can be used to automatically generate useful documents during or after a meeting

Generative AI can be applied in other domains too. If we are in the process of making a video game, we can use AI to generate character art to inspire our creative process, or generate character animations to yield natural motions that both make our game immersive and free up developers for other tasks.

World concept art and character art for a video game, created in less than a minute using Stable Diffusion

The potential applications of Generative AI are numerous and diverse, so it becomes difficult to discuss them all together without narrowing our scope to a specific domain. In what follows, we’ll instead provide a general definition of Generative AI, followed by an examination of its value proposition in this more general context.

Generative AI vs. Discriminative AI

Generative AI is most easily described by contrasting it with Discriminative AI. As we have seen, Generative AI is useful when we want to generate data, information, or documents. On the other hand, Discriminative AI is useful when we want to make some sort of decision.

For example, if we are in the healthcare sector we may want to predict whether someone is at risk for cancer given some biometric data - height, weight, smoking history, blood pressure, etc. We want to use this information to decide whether or not this person is at risk for cancer.

Given the input data, the Discriminative AI model makes a decision as to whether or not this individual is at risk for cancer

We can similarly make decisions using other kinds of data. Instead of a list of numbers as above, we might have an image. For example, we may have a radiological image and want to determine whether it shows a cancerous tumor.

Discriminative AI can use different types of input data. In this case, the input is an image (modified from source).

Of course, we can use Discriminative AI in other domains as well. Instead of the health sector, perhaps we are in the banking industry and want to determine, given transaction history, whether someone's identity has been compromised or stolen. We may try to identify suspicious transactions using Discriminative AI.

Discriminative AI works in other domains too, like in the banking sector

The given data - height, blood pressure, transaction history, etc. - can be referred to as features. When working with Discriminative AI, we don’t care about these features in and of themselves - we only care about them insofar as they help us make a decision.
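To make the "features in, decision out" idea concrete, here is a minimal sketch of a toy discriminative model. It is not a real clinical model: the feature names (`packYears` as a stand-in for smoking history, `systolicBp` for blood pressure), the weights, the bias, and the threshold are all hypothetical values chosen purely for illustration.

```javascript
// Toy discriminative model: maps features to a decision.
// All weights, the bias, and the threshold are hypothetical illustration values.
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

function predictRisk(features, weights, bias) {
  // Weighted sum of the features, squashed to a score between 0 and 1
  let z = bias;
  for (const name of Object.keys(weights)) {
    z += weights[name] * features[name];
  }
  return sigmoid(z);
}

function decide(features, weights, bias, threshold = 0.5) {
  // The features matter only insofar as they move the score across the threshold
  return predictRisk(features, weights, bias) >= threshold ? 'at risk' : 'not at risk';
}

const weights = { packYears: 0.08, systolicBp: 0.02 }; // hypothetical
const bias = -4.0; // hypothetical

console.log(decide({ packYears: 30, systolicBp: 150 }, weights, bias)); // at risk
console.log(decide({ packYears: 0, systolicBp: 115 }, weights, bias)); // not at risk
```

Note that the model never describes what a plausible person looks like. It only draws a line between two outcomes.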

In contrast, with Generative AI, we do care about these features themselves. Indeed, the whole goal of Generative AI is to understand how these features relate in order to generate plausible data. For example, suppose our goal is to generate a representative sample of humans in terms of body size (considering only height and weight here for simplicity). Then, the sample illustrated below is not very realistic:

Such unlikely heights and weights form even less likely combinations, and an even less likely sample together as a triplet

In particular, it is unlikely to have someone that tall and thin, or that short and wide; and it’s even less likely to have a sample of 3 such extremes simultaneously. Instead, we need to model the statistical distribution of weight and height in the population we wish to sample from, in order to generate more realistic novel data, like this:

Considering only height and weight, this sample of males is much more realistic than the previous sample
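As a rough sketch of what "modeling the distribution" can look like, the snippet below samples height and weight together from a correlated two-dimensional Gaussian. This is only one simple modeling choice, and the means, standard deviations, and correlation are hypothetical placeholders rather than real population statistics. The small seeded generator (`mulberry32`) is there only to make the output reproducible.

```javascript
// Small seeded pseudo-random generator (mulberry32), used for reproducibility
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// One standard normal draw via the Box-Muller transform
function standardNormal(rand) {
  const u1 = 1 - rand(); // in (0, 1], avoids log(0)
  const u2 = rand();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Draw height and weight jointly, so that taller samples tend to be heavier
function samplePerson(rand, p) {
  const z1 = standardNormal(rand);
  const z2 = standardNormal(rand);
  const heightCm = p.meanHeight + p.sdHeight * z1;
  // Mixing z1 into the weight draw gives the two features correlation rho
  const weightKg =
    p.meanWeight + p.sdWeight * (p.rho * z1 + Math.sqrt(1 - p.rho ** 2) * z2);
  return { heightCm, weightKg };
}

// Hypothetical parameters, for illustration only
const params = { meanHeight: 175, sdHeight: 7, meanWeight: 80, sdWeight: 12, rho: 0.5 };
const rand = mulberry32(42);
const people = Array.from({ length: 3 }, () => samplePerson(rand, params));
console.log(people);
```

Sampling height and weight independently would ignore how the two features relate, which is exactly the relationship a generative model is meant to capture.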

The next short section will be lightly technical and construct a loose mathematical framework around this concept. Feel free to skip this next section and jump ahead if the concepts or verbiage are too unfamiliar.

In mathematical terms

Oftentimes, Discriminative AI is considered to be modeling a conditional distribution, whereas Generative AI is considered to be modeling a joint distribution.

Discriminative AI (left) finds a conditional distribution or decision boundary in the space, whereas Generative AI (right) models the joint distribution

Aside - conditional and joint distributions

Conditional distributions give the probability of different events occurring conditioned on a fact. For example, we can ask what the probability of rolling a specific number on a 6-sided die is. For a fair die, the probability will be ⅙ for all numbers on the die. Now, consider how these probabilities change when we are given extra information. What if we are told that the number we rolled was even? With this information, the probabilities of the number being a 1, 3, or 5 are now zero, and the probabilities of it being a 2, 4, or 6 are now ⅓.
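The die example can be checked by direct enumeration. The sketch below assumes a fair die and uses a helper name of our own choosing (`conditionalDistribution`); it keeps the outcomes that satisfy the condition and shares the probability equally among them.

```javascript
const faces = [1, 2, 3, 4, 5, 6];

// P(face | condition) for a fair die: restrict to the outcomes that satisfy
// the condition, then share the probability equally among them.
function conditionalDistribution(faces, condition) {
  const allowed = faces.filter(condition);
  const dist = {};
  for (const face of faces) {
    dist[face] = allowed.includes(face) ? 1 / allowed.length : 0;
  }
  return dist;
}

const isEven = (n) => n % 2 === 0;

console.log(conditionalDistribution(faces, () => true)); // every face: 1/6
console.log(conditionalDistribution(faces, isEven)); // odd faces: 0, even faces: 1/3
```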


On the other hand, joint distributions give the probability of multiple events occurring simultaneously. For example, what is the probability of rolling a 2 on a first die roll and a 3 on a second die roll? In this case, the probability is 1/36 assuming, again, that the die is fair.


To summarize, a conditional distribution provides the probability of A occurring given B, and a joint distribution provides the probability of A occurring alongside B.
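The two notions are linked by a standard identity, written out below with the die numbers from above. The notation (A, B, and the roll labels) is ours and is only meant as a worked check of the arithmetic.

```latex
% Conditional probability in terms of the joint probability
P(A \mid B) = \frac{P(A, B)}{P(B)}

% One roll: A = "the roll is a 2", B = "the roll is even"
P(2 \mid \text{even}) = \frac{P(2, \text{even})}{P(\text{even})}
                      = \frac{1/6}{1/2} = \frac{1}{3}

% Two independent rolls of a fair die: the joint probability factorizes
P(\text{first} = 2, \text{second} = 3) = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}
```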

This definition is not rigorous. Note in particular that not all Discriminative AI techniques model a conditional distribution because not all Discriminative AI methods even model a distribution in the first place. For example, Support Vector Machines are not probabilistic, but they are still used for Discriminative AI by finding a decision boundary in the space.

On the other hand, with Generative AI it is generally safe to say that we are modeling a joint distribution because the distribution itself is the object of interest. Once we model the distribution, we can use it in different ways. We can perform density estimates, for example estimating the probability of someone being taller than 71 in (180 cm) and lighter than 150 lbs (68 kg).
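One simple way to approximate such a probability is to draw samples from the modeled distribution and count the fraction that lands in the region of interest. The sketch below assumes we already have such samples. The eight hard-coded values are made up for illustration and stand in for draws from a fitted model, so the resulting number says nothing about any real population.

```javascript
// Hypothetical samples (height in inches, weight in pounds), standing in for
// draws from a fitted joint distribution. The values are made up.
const samples = [
  { heightIn: 72, weightLb: 145 },
  { heightIn: 69, weightLb: 170 },
  { heightIn: 74, weightLb: 190 },
  { heightIn: 66, weightLb: 140 },
  { heightIn: 73, weightLb: 148 },
  { heightIn: 70, weightLb: 165 },
  { heightIn: 68, weightLb: 155 },
  { heightIn: 75, weightLb: 210 },
];

// Fraction of samples that fall inside the region described by inRegion
function estimateProbability(samples, inRegion) {
  if (samples.length === 0) throw new Error('need at least one sample');
  const hits = samples.filter(inRegion).length;
  return hits / samples.length;
}

// Region of interest: taller than 71 in and lighter than 150 lbs
const tallAndLight = (p) => p.heightIn > 71 && p.weightLb < 150;

const estimate = estimateProbability(samples, tallAndLight);
console.log(estimate); // 0.25 for these eight made-up samples
```

With a real model we would draw many more samples, and the estimate would become more stable as the sample count grows.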

Alternatively, we can sample from this distribution to generate novel data, which we can do for various reasons. One reason could be to use the generated data to train another AI model. Another reason might be to use the generated data in its own right (as we do with models like DALL-E 2). It is this second reason that people are usually referring to when they talk about “Generative AI” colloquially with respect to the recent progress that has earned public attention.

In any event, when we are talking about Generative AI from a mathematical standpoint, we are generally talking about modeling a joint distribution.

The value of Generative AI

We have seen how Generative AI can be used in a straightforward way, say as artistic inspiration for world or character models for a video game.

Beyond these obvious creative use-cases, Generative AI can be thought of in an alternative, more abstract way that is more helpful at a conceptual level. In particular, perhaps the most general way to think about Generative AI is as a mapping from (potential) antecedents to desired consequents. Let’s take a look at what this means in more detail.

When developing a project, product, or business, we usually have some defined goals. To reach a goal, we usually have measurable outcomes that serve as a proxy for that goal. For example, let’s say that our goal is to become the preeminent brand in the X market. Let’s say that a Y% increase in shares of our product X on social media is our measurable outcome that serves as a good proxy for measuring progress towards our goal. How might we go about attaining that Y% increase?

The measurable outcome of increased social shares entails progress towards our goal in this scenario

In general, we may have many ideas for how to achieve this outcome. Ultimately, what we are doing is seeking to implement some change or idea (an antecedent) that will lead to the desired outcome (the consequent).

We may have many ideas (antecedents) for potential initiatives that could lead to our desired outcome (the consequent)

Oftentimes, we don’t know if an idea is actually an antecedent. That is, will the implementation of this idea actually lead to the desired consequent? Therefore it is our job to investigate, implement, and iterate on multiple potential antecedents in an attempt to observe the desired consequent. We must establish, to the best of our ability, that an idea is in fact an antecedent to the desired consequent - i.e. that the change entails the consequent.

Establishing entailment means that a particular antecedent leads to the desired consequent, which in turn leads to our main goal

Note

While we may not know for certain a priori that a given change is an antecedent, we may have great confidence in a particular change or idea and therefore already be committed to implementing it. Note that Generative AI remains just as useful, in the ways discussed below, in this simpler scenario.

The critical point here is that the implementation details of the potential antecedents are usually the bottleneck in this process. It is very easy to think of what outcome we want and of what change might lead to that result, but the details of how we bridge the two are the challenge. Regardless of the specific domain, the human implementation details are where the bulk of the work lies and where the bottleneck occurs.

Human implementation is generally the bottleneck in our ability to test and/or implement various ideas

In this situation, Generative AI can be thought of as a tool to build the bridge between the potential antecedents and the desired consequent. We can use Generative AI in order to expedite many of the implementation details involved with putting these ideas into action:

Generative AI can be used to greatly expedite the implementation of our ideas

A simple example

Let’s see how this works by taking a specific example from our above scenario. In particular, let’s look at idea B, which is to add a button to our product page to easily share the product on social media. Remember, our desired outcome is to achieve a Y% increase in social shares, so this change seems like a good candidate as a potential antecedent to this outcome.

From a technical standpoint, implementing this change (i.e. adding this button) can take a bit of time. We may have to design a component that stylistically matches our website and incorporate it into a template so that it shows up on multiple pages. We have to explore how the APIs of external social media platforms work. We may also have to require login or account creation first, which would require interacting with a database. Generative AI can expedite all of these tasks.

We can use Generative AI to write the (relatively straightforward) code to make this happen. Below, I’ve asked ChatGPT to write a simple Next.js component that implements these social share buttons:

The main human work that remains is review. Note that the model even gives an explanation of the code, which shortens review time further. We can go further and ask the model to add styling to our component using some specific tools in our stack:

Here is the result:

import { useRouter } from 'next/router';
import { makeStyles } from '@material-ui/core/styles';
import Button from '@material-ui/core/Button';

const useStyles = makeStyles((theme) => ({
  button: {
    margin: theme.spacing(1),
    boxShadow: 'none',
    '&:hover': {
      boxShadow: theme.shadows[4],
    },
  },
}));

const ShareButtons = () => {
  const router = useRouter();
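  // Placeholder domain. Encode the full URL so it survives as a query-string value.
  const pageUrl = encodeURIComponent(`https://yourwebsite.com${router.asPath}`);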
  const classes = useStyles();

  const shareOnTwitter = () => {
    window.open(`https://twitter.com/intent/tweet?url=${pageUrl}`);
  };

  const shareOnFacebook = () => {
    window.open(`https://www.facebook.com/sharer/sharer.php?u=${pageUrl}`);
  };

  const shareOnLinkedIn = () => {
    window.open(`https://www.linkedin.com/sharing/share-offsite/?url=${pageUrl}`);
  };

  return (
    <div>
      <Button variant="contained" color="primary" className={classes.button} onClick={shareOnTwitter}>
        Share on Twitter
      </Button>
      <Button variant="contained" color="primary" className={classes.button} onClick={shareOnFacebook}>
        Share on Facebook
      </Button>
      <Button variant="contained" color="primary" className={classes.button} onClick={shareOnLinkedIn}>
        Share on LinkedIn
      </Button>
    </div>
  );
};

export default ShareButtons;

With solid domain knowledge and good prompt engineering principles, a human can use Generative AI to significantly shorten the time required to implement such a feature.

Of course, this is a simple example to communicate the essential idea outlined above. The actual task is not very complicated, but that’s the point. Much of the work that leads to valuable business outcomes is not incredibly complicated, and Generative AI can be used to greatly expedite the implementation of these changes when wielded by a competent user.

The bottom line is that Generative AI makes it fast and easy to implement ideas.

Final note

Further, we should remember that today's Generative AI is likely a lower bound on how helpful the technology will be. As models become more competent, the reach of their potential applications and the depth of their helpfulness are likely to increase. Additionally, as bespoke integrations for Generative AI models are developed for traditional tools, Generative AI will become tightly interwoven with our workflows, further compounding its impact.

Modern Generative AI

Generative AI is not a new technology, but the recent explosion in performance and interest can be attributed to advances made in the last 5 years or so.

In the image space (models like DALL-E 2, Imagen, Stable Diffusion, etc.), advances have relied primarily on the development of Diffusion Models.

In the language space (models like ChatGPT, GPT-4, etc.), advances have been made primarily by the scaling of the Transformer architecture.

Each of these branches is generally founded on its own body of research. For the remainder of our Everything you need to know about Generative AI series, we will look at each domain in turn, covering one topic per article. We will assume no prior knowledge beyond this introduction in order to help readers at all experience levels with AI better understand the models that are making their way into our lives. Each article will be self-contained, so you can pick and choose the topics that interest you.

If you enjoyed this article, make sure to follow our newsletter to be notified when our next article drops.