
Building a GPT Backed by a Heroku-Deployed API

How to connect your GPT on OpenAI to a backend Node.js app

Late in 2023, OpenAI introduced GPTs, a way for developers to build customized versions of ChatGPT that can bundle in specialized knowledge, follow preset instructions, or perform actions like reaching out to external APIs. As more and more businesses and individuals use ChatGPT, developers are racing to build powerful GPTs to ride the wave of ChatGPT adoption.

Introducing GPTs


If you’re thinking about diving into GPT development, we’ve got some good news: Building a powerful GPT mostly involves building an API that handles a few endpoints. And in this post, we’ll show you how to do it.

In this walk-through, we’ll build a simple API server with Node.js. We’ll deploy our API to Heroku for simplicity and security. Then, we’ll show you how to create and configure a GPT that reaches out to your API. This project is part of our Heroku Reference Applications GitHub organization where we host different projects showcasing architectures and patterns to deploy to Heroku.

This is going to be a fun one. Let’s do it!

Our GPT: An Employee Directory

Imagine your organization uses ChatGPT internally for some of its operations. You want to provide your users (employees) with a convenient way to search through the employee database. These users aren’t tech-savvy. What’s an SQL query anyway?

With natural language, our users will ask our custom GPT a question about employees in the company. For example, they might ask: “Who do we have in the marketing department that was hired in 2021?”

The end user doesn’t know (or care) about databases, queries, or result rows. Our GPT will send a request to our API. Our API will find the requested information and return a natural language response, which our GPT sends back to the end user.

Here’s how it looks:

GPT example

Pretty cool, right? The basic flow looks like this:

  1. In the ChatGPT interface, the user asks our GPT a question related to the employee directory.
  2. The GPT sends a POST request containing the user’s question to our API.
  3. Our API calls OpenAI’s Chat Completions API to help translate the user’s question into a well-formed SQL query.
  4. Our API uses the SQL query to fetch results from the employee database.
  5. Our API calls OpenAI’s Chat Completions API to process the query results into a natural language response.
  6. Our API passes this response back to the GPT.
  7. ChatGPT presents the response to the user.
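
The steps above can be sketched as a small async pipeline. This is an illustrative sketch only — the stub functions stand in for the real OpenAI and database calls we'll build later in the post:

```javascript
// Stubs standing in for the real OpenAI and database calls.
const craftQuery = async (question) =>
  "SELECT first_name, hire_date FROM employees WHERE department = 'Marketing'"
const runQuery = async (sql) => [{ first_name: 'Alice', hire_date: '2021-03-01' }]
const summarize = async (question, sql, rows) =>
  `I found ${rows.length} employee(s): ${rows.map((r) => r.first_name).join(', ')}.`

// Steps 3–5 of the flow, chained together.
const handleSearch = async (question) => {
  const sql = await craftQuery(question)   // step 3: question → SQL
  const rows = await runQuery(sql)         // step 4: run the query
  return summarize(question, sql, rows)    // step 5: rows → natural language
}
```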

Architecture diagram

Note: In the architecture above, all the data leaves the Heroku trust boundary to access OpenAI services. Take this into account when building data-sensitive applications.

Prerequisites and Initial Steps

Note: If you want to try the application first, deploy it using the “Deploy to Heroku” button in the reference application’s README file.

Before you can get started, you’ll need a few things in place:

  1. An OpenAI account. You’ll need to add a payment method and purchase a small amount of credit, since you’ll be using its paid APIs.
  2. Once you have your OpenAI account set up, you’ll need to create a secret API key and copy it down. Your API application will need this key to authenticate its requests to the OpenAI API.
  3. A Heroku account. You’ll need to add a payment method to cover your compute and database costs. For building and testing this API, we recommend using an eco dyno, which has a $5 monthly flat fee. It’ll supply you with more than enough hours for initial development. You’ll also need Heroku Postgres. You can use the Mini plan, at $0.007/hour, which is enough for this application.
  4. A GitHub account for your code repository. Heroku will hook into your GitHub repo directly, simplifying deployment to a single click.
  5. Clone the GitHub repo with the code for the API application.

Note: Every request incurs costs, and the price varies depending on the selected model. For example, using the GPT-3 model, you'd have to ask more than 20,000 questions to spend $1. See the OpenAI API pricing page for more information.

The README in the repo has all the instructions you need to get the API server deployed to Heroku. If you just want to get your GPT up and running quickly, skip down to the Create and Configure GPT section. Otherwise, you can follow along as we walk through how to build this API.

We used Node v20.10.0 and yarn as our package manager. Install the project’s dependencies:

yarn install

Build the API

One of the most powerful ways to use OpenAI’s custom GPTs is by building an API that your GPT reaches out to. Here’s how OpenAI’s blog post introducing GPTs describes it:

In addition to using our built-in capabilities, you can also define custom actions by making one or more APIs available to the GPT… Connect GPTs to databases, plug them into emails, or make them your shopping assistant. For example, you could integrate a travel listings database, connect a user’s email inbox, or facilitate e-commerce orders.

So, even though we’re building a GPT, under the hood we are simply building an API. For this, we use Express and listen for POST requests to the /search endpoint. We can build and test our API as a standalone unit before creating our GPT and custom action.

Let’s look at src/index.js for how our server will handle POST requests to /search. To keep our code snippet easily readable, we’ve left out the logging and error handling:

server.post('/search', authMiddleware, async (req, res) => {
  …
  const userPrompt = req.body.message
  const sql = await AI.craftQuery(userPrompt)
  let rows = []
  …
  rows = await db.query(sql)
  …
  const results = await AI.processResult(userPrompt, sql, rows)
  res.send(results)
})

As you can see, the major steps we need to cover are:

  1. Ask OpenAI to craft an SQL query.
  2. Query the database.
  3. Ask OpenAI to turn the query results into a natural language response.

Using OpenAI’s Chat Completions API

Because our API will need to do some natural language processing, it will make some calls to OpenAI’s Chat Completions API. Not every API needs to do this. Imagine a simple API that just needs to return the current date and time. It doesn’t need to rely on OpenAI for its business logic.

But our GPT’s supporting API will need the Chat Completions API for basic text generation.
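
Before diving into our code, it helps to see the shape of a Chat Completions request: essentially a model name, sampling settings, and an array of role-tagged messages. Here's a minimal sketch of that request body — the model name and message text are illustrative assumptions, not values from the repo:

```javascript
// A minimal Chat Completions request body (illustrative values).
const settings = {
  model: 'gpt-3.5-turbo',   // our code reads the model name from a constant
  temperature: 0,           // low temperature for deterministic answers
  messages: [
    { role: 'system', content: 'You translate questions about employees into SQL.' },
    { role: 'user', content: 'Who was hired in 2021?' }
  ]
}

// With the official openai package, this object would be passed to:
//   openai.chat.completions.create(settings)
```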

The first call to OpenAI: generate an SQL query

As per our flow (see the diagram above), we’ll need to ask OpenAI to convert the user’s original question into an SQL query. Let’s look at src/ai.js to see how we do this.

When sending a request to the Chat Completions API, we send an array of messages to help ChatGPT understand the context, including what’s being requested and how we want ChatGPT to behave in its response. Our first message is a system message, where we set the stage for ChatGPT.

const PROMPT = `
  I have a psql db with an "employees" table, created with the following statements:

  create type department_enum as enum('Accounting','Sales','Engineering','Marketing','Product','Customer Service','HR');

  create type title_enum as enum('Assistant', 'Manager', 'Junior Executive', 'President', 'Vice-President', 'Associate', 'Intern', 'Contractor');

  create table employees(id char(36) not null unique primary key, first_name varchar(64) not null, last_name varchar(64) not null, email text not null, department department_enum not null, title title_enum not null, hire_date date not null);
`.trim()

const SYSTEM_MESSAGE = { role: 'system', content: PROMPT }

Our craftQuery function looks like this:

const craftQuery = async (userPrompt) => {
  const settings = {
    messages: [SYSTEM_MESSAGE],
    model: CHATGPT_MODEL,
    temperature: TEMPERATURE,
    response_format: {
      type: 'json_object'
    }
  }

  settings.messages.push({
    role: 'system',
    content: 'Output JSON with the query under the "sql" key.'
  })

  settings.messages.push({
    role: 'user',
    content: userPrompt
  })
  settings.messages.push({
    role: 'user',
    content: 'Provide a single SQL query to obtain the desired result.'
  })

  logger.info('craftQuery sending request to openAI')

  const response = await openai.chat.completions.create(settings)
  const content = JSON.parse(response.choices[0].message.content)
  return content.sql
}

Let’s walk through what this code does in detail. First, we put together the set of messages that we’ll send to ChatGPT:

  1. The initial system message that lays out how we have structured our database, so that ChatGPT knows column names and constraints when crafting a query.
  2. A system message that tells ChatGPT the format/structure we want for the response. In this case, we want the response as JSON (not natural language), with the SQL query under the key called sql.
  3. A user message, which is the end user’s original request.
  4. A follow-up user message, where we specifically ask ChatGPT to generate a single SQL query for us, based on what we’re looking for.

We use the openai package (not shown) for Node.js. This is the official JavaScript library for OpenAI, serving as a convenient wrapper around the OpenAI API. With our settings in place, we call the create function to generate a response. Then, we return the sql statement (in the JSON object) from OpenAI’s response.

Use SQL to query the database

Back in src/index.js, we use the SQL statement from OpenAI to query our database. We wrote a small module (src/db.js) to handle connecting with our PostgreSQL database and sending queries.

Our call to db.query(sql) returns the query result, an array called rows.
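
A hypothetical sketch of what src/db.js might look like, assuming the pg package (the repo's exact code may differ). The connection pool is created lazily on first use and then reused across queries:

```javascript
// Sketch of src/db.js: a thin wrapper around a pg connection pool.
let pool = null

const getPool = () => {
  if (!pool) {
    const { Pool } = require('pg')
    // DATABASE_URL is the PostgreSQL connection string from the environment.
    pool = new Pool({ connectionString: process.env.DATABASE_URL })
  }
  return pool
}

// Runs a SQL string and returns the result rows.
const query = async (sql) => {
  const result = await getPool().query(sql)
  return result.rows
}

module.exports = { query }
```

Calling db.query(sql) then resolves to the rows array our handler works with.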

The second call to OpenAI: process the query results

Although our API could send back the raw database query results to the end user, it would be a better user experience if we turned those results into a human-readable response. Our user doesn’t need to know that there was a database involved. A natural language response would be ideal.

So, we’ll send another request to the Chat Completions API. In src/ai.js, we have a function called processResult:

const processResult = async (userPrompt, sql, rows) => {
  const settings = {
    messages: [SYSTEM_MESSAGE],
    model: CHATGPT_MODEL,
    temperature: TEMPERATURE
  }

  const userMessage = `
    This is how I described what I was looking for: ${userPrompt}

    This is the query sent to find the results: ${sql}

    Here is the resulting data that you found:
    ${JSON.stringify(rows)}

    Assume I am not even aware that a database query was run. Do not include the SQL query in your response to me. If the original request does not explicitly specify a sort order, then sort the results in the most natural way. Return the resulting data to me in a human-readable way, not as an object or an array. Keep your response direct. Tell me what you found and how it is sorted.
  `
  settings.messages.push({
    role: 'user',
    content: userMessage
  })

  logger.info('processResult sending request to openAI')

  const response = await openai.chat.completions.create(settings)
  return response.choices[0].message.content
}

Again, we start with an initial system message that gives ChatGPT information about our database. At this point, you might ask: Didn’t we already do that? Why do we need to tell ChatGPT about our database structure again? The answer is in the Chat Completions API documentation:

Including conversation history is important when user instructions refer to prior messages. … Because the models have no memory of past requests, all relevant information must be supplied as part of the conversation history in each request.

Along with the database structure, we want to provide ChatGPT with some more context. In userMessage, we include:

  1. The user’s original question (userPrompt), so ChatGPT knows what question it is ultimately answering.
  2. The sql query that we used to fetch the results from the database.
  3. The database query results (rows).
  4. Clear instructions about what we want ChatGPT to do now—that is, “return the resulting data to me in a human-readable way” (along with some other guidelines).

Similar to before, we send these settings to the create function, and then pass the response content up to the caller.

Other implementation details (not shown)

The code snippets we’ve shown cover the major implementation details for our API development. You can always take a look at the GitHub repo to see all the code, line by line. Some details that we didn’t cover here are:

  • Creating a PostgreSQL database with an employees table and populating it with dummy data. See the data/create_schema.sql and data/create_records.sql files for this.
  • Implementing bearer auth for our API (see src/auth.js). Requests to our API must attach an API key that we generate. We store this API key as an environment variable called BEARER_AUTH_API_KEY. We’ll discuss this lower down when configuring our GPT.
  • Writing basic unit tests with Jest.
  • ESLint and Prettier configurations to keep our code clean and readable.
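
Since Express middleware is just a function of (req, res, next), the bearer-auth check can be sketched without the framework itself. This is a hypothetical reconstruction of what src/auth.js might look like, not the repo's exact code:

```javascript
// Hypothetical sketch of a bearer-auth middleware like src/auth.js.
// Compares the Authorization header against the BEARER_AUTH_API_KEY env var.
const authMiddleware = (req, res, next) => {
  const header = req.headers['authorization'] || ''
  const [scheme, token] = header.split(' ')
  if (scheme !== 'Bearer' || token !== process.env.BEARER_AUTH_API_KEY) {
    // Reject the request before it reaches the /search handler.
    return res.status(401).send({ error: 'Unauthorized' })
  }
  next()
}

module.exports = { authMiddleware }
```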

Testing our API’s business logic

With all of our code in place, we can test our API by sending a POST request, just like our GPT would send a request when a user makes a query. When we start our server locally, we make sure to have a .env file that contains the environment variables that our API will need:

  • OPENAI_API_KEY: The openai JavaScript package uses this to authenticate requests we send to the Chat Completions API.
  • BEARER_AUTH_API_KEY: This is the API key that a caller of our API will need to provide for authentication.
  • DATABASE_URL: The PostgreSQL connection string for our database.

An example .env file might look like this:

OPENAI_API_KEY=sk-Kie************************************************
BEARER_AUTH_API_KEY=thisismysecretAPIkey
DATABASE_URL=postgres://db_user:db_pass@localhost:5432/company_hr_db

We start our server:

node src/index.js

In a separate terminal, we send a curl request to our API:

curl -X POST \
  --header "Content-type:application/json" \
  --header "Authorization: Bearer thisismysecretAPIkey" \
  --data "{\"message\":\"Please find names and hire dates of any employees in the marketing department hired after 2018. Sort them by hire date.\"}" \
  http://localhost:3000/search

I found the names and hire dates of employees in the marketing department who were hired after 2018. The data is sorted by hire date in ascending order. Here are the results:

- Jailyn McClure, hired on 2019-02-21
- Leopold Johnston, hired on 2019-02-21
- Francis Kris, hired on 2019-10-09
- Jerad Strosin, hired on 2019-10-22
- Daniela Boehm, hired on 2020-05-25
- Joe Torp, hired on 2020-05-31
- Harry Heaney, hired on 2020-08-16
- Anabel Sporer, hired on 2020-12-22
- Carson Gislason, hired on 2020-12-25
- Bud Farrell, hired on 2021-05-04
- Katelynn Swaniawski, hired on 2021-07-13
- Ernesto Baumbach, hired on 2021-08-15
- Gwendolyn DuBuque, hired on 2021-10-10
- Willow Green, hired on 2021-11-20
- Rodrigo Fay, hired on 2022-07-04
- Makayla Crooks, hired on 2022-08-02
- Gerry Boehm, hired on 2022-09-28
- Gretchen Mertz, hired on 2023-02-15
- Chloe Bayer, hired on 2023-03-30
- Alek Herman, hired on 2023-05-25
- Eloy Flatley, hired on 2023-08-25
- Zackery Welch, hired on 2023-09-08

Our API works as expected! It interpreted our request, queried the database successfully, and then returned results in a human-readable format.
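
If you prefer JavaScript over curl, the same test request can be expressed with the fetch built into Node 18+ — the URL and key below are the local-testing values from above:

```javascript
// The same POST request as the curl example, expressed with Node 18+ fetch.
const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: 'Bearer thisismysecretAPIkey'
  },
  body: JSON.stringify({
    message: 'Please find names and hire dates of any employees in the marketing department hired after 2018. Sort them by hire date.'
  })
}

// Requires the local server to be running:
// const response = await fetch('http://localhost:3000/search', options)
// console.log(await response.text())
```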

Now it’s time to create our custom GPT.

Deploy to Heroku

First, we need to deploy our API application to Heroku.

Step 1: Create a new Heroku app

After logging in to Heroku, go to the Heroku dashboard and click Create new app.

Create new app

Provide a name for your app. Then, click Create app.

Provide a name for the app

Step 2: Connect your Heroku app to your project repository

With your Heroku app created, connect it to the GitHub repository for your project.

Connect to GitHub

Step 3: Add Heroku Postgres

You’ll also need a PostgreSQL database running alongside your API. Go to your app’s Resources page and search the add-ons for “postgres.”

Add Heroku Postgres addon

Select the “Mini” plan and submit the order form.

Select Heroku Postgres plan

Step 4: Set up app config vars

You’ll recall that our API depends on a few environment variables (in .env). When deploying to Heroku, you can set these up by going to your app’s Settings tab, under Config Vars. Add a new config var called OPENAI_API_KEY, and paste in the value you copied from OpenAI.

Notice that Heroku has added a DATABASE_URL config var based on your Heroku Postgres add-on. Convenient!

Finally, you need to add a config var called BEARER_AUTH_API_KEY. This is the key that any caller of our API (including ChatGPT, through our custom GPT’s action) will need to provide for authentication. You can set this to any value you want. We used an online random password generator to generate a string.

Configuring environment variables

Step 5: Seed the database

Don’t forget to seed your newly running Heroku Postgres database with the dummy data. Assuming you have the Heroku CLI installed, accessing your database add-on is incredibly convenient. Set up your database with the following:

heroku pg:psql < data/create_schema.sql
heroku pg:psql < data/create_records.sql

Step 6: Deploy

Go to the Deploy tab for your Heroku app. Click Deploy Branch. Heroku takes the latest commit on the main branch, installs dependencies, and then starts the server (yarn start). You can deploy your API in seconds with just one click.

Deploy from the dashboard

After you’ve deployed your application, click Open app.

Open app

Opening your app’s default page shows a Swagger UI interface with the API specification for our app. We get this functionality from the swagger-ui-express package.

Swagger UI

Create and Configure GPT

Creating a GPT is quick and easy. When you’re logged into https://chat.openai.com/, click Explore GPTs in the left-hand navigation. Then, click the + Create button.

Configure the initial settings

There are two tabs you can navigate when creating a GPT. The Create tab is a wizard-style interface where you interact with the GPT Builder to solidify what you want your GPT to do. Since we already know what we want to do, we will configure our GPT directly. Click the Configure tab.

Configure GPT

We provide a name, description, and basic instructions for our GPT. We also upload the logo for our GPT. The codebase has a logo you can use: resources/logo.png.

New GPT

For “Capabilities”, we can uncheck all of the options, as our GPT will not need to use them.

GPT Capabilities

Create new action

The “meat” of our GPT will be an action that calls our Heroku-deployed API. At the bottom of the Configure page, we click Create new action.

Create new action

To configure our GPT’s action, we need to specify the API authentication scheme and provide the OpenAPI schema for our API. With this information, our GPT will have what it needs to call our API properly.

For authentication, we select API Key as the authentication type and Bearer as the auth type. Then, we enter the value we set for our BEARER_AUTH_API_KEY config var.

Authentication configuration

For schema, we need to import or paste in the OpenAPI specification for our API. This specification lets ChatGPT know what endpoints are available and how to interact with our API. Fortunately, because we use swagger-ui-express, we have access to a dynamically generated OpenAPI spec simply by visiting the /api-docs/openapi.yaml route in our Heroku app.

We click Import from URL and paste in the URL for our Heroku app serving up the OpenAPI spec (for example, https://my-gpt-12345.herokuapp.com/api-docs/openapi.yaml). Then, we click Import. This loads in the schema.

OpenAPI schema

With the configurations for action set, we click Save (Publish to Only me).

Publish options

Now, we can test out some interactions with our GPT.

Using the GPT

Using the GPT example

Everything is connected and working! If you’ve been following along and performing these steps, then congratulations on building your first GPT!

Conclusion

Experience in building and deploying custom GPTs sets you up to enhance the ChatGPT experience of businesses and individuals who are adopting it en masse. The majority of the work in building a GPT with an action is in implementing the API. After this, you only need to make a few setup configurations, and you’re good to go.

Deploying your API to Heroku—along with any add-ons you might need, like a database or a key-value store—is quick, simple, and low cost. When you’re ready to get started, sign up for a Heroku account and begin building today!

Originally published: March 28, 2024
