Posted by Chris
We continuously use support data to identify high-impact issues on our platform. Over the course of a couple of weeks in July, we reduced the volume of support inquiries related to Heroku Postgres by over a third, even as overall usage of the product increased. In this post, we'll tell you that story and a bit about how we do support here at Heroku.
Identifying High-Impact Support Issues
The way we approach customer support at Heroku is two-fold. On the surface, we’re here to answer your questions and help you fix issues with your apps. We also play an integral role in advocating for our customers’ needs within the company. The best support is the kind you never have to use, so we strive to reduce the overall need to file tickets by bubbling support data up to the teams building the product. They then use this data to help guide their priorities.
One way we do this is by generating regular reports from the tickets filed over a given period of time. Using an internal app, we pull a random sample of tickets for a given product during the specified time period. We size the sample so that it is small enough for human review yet large enough to remain statistically representative. The app lets us assign each sampled ticket a “culprit”, producing a high-level grouping of issues. The culprit may be a specific feature, a UI/UX issue, or even general guidance and questions. We can also add a freeform note to each ticket. Here is a chart from a report that was created in early July for our Postgres product:
Here we’re able to easily identify pgbackups and questions/guidance as the biggest contributors to customer pain on our Postgres offering. By drilling down into the ticket notes for those groups, we determined that the main problem with pgbackups was instability when capturing manual backups. In addition, we received a lot of questions about when to use automated backups versus when to rely on continuous protection.
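The sampling and grouping steps described above can be sketched in a few lines of Python. This is only an illustration, not our internal app: the function names, the ticket dictionaries, and the `culprit` field are hypothetical, and the sizing rule (Cochran's formula with a finite population correction, at a 95% confidence level and a 5% margin of error) is just one common way to pick a sample size, assumed here for the example.

```python
import math
import random
from collections import Counter


def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Cochran's formula with finite population correction.

    Assumed sizing rule for illustration; the defaults correspond to
    95% confidence and a 5% margin of error.
    """
    if population <= 0:
        return 0
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    # Never ask for more tickets than exist.
    return min(population, math.ceil(n))


def sample_tickets(tickets, seed=None):
    """Draw a simple random sample (without replacement) for human review."""
    rng = random.Random(seed)
    return rng.sample(tickets, sample_size(len(tickets)))


def culprit_report(reviewed):
    """Count reviewed tickets per culprit, largest group first."""
    return Counter(ticket["culprit"] for ticket in reviewed).most_common()
```

A reviewer would work through the output of `sample_tickets`, set a `culprit` (and optionally a note) on each ticket, and then pass the reviewed tickets to `culprit_report` to get the grouping that feeds a chart like the one above.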
With this data in hand, we’re able to show our product and engineering teams the very real impact that certain bugs or other issues have on our customers. In some cases, a support engineer can resolve the issue themselves by sending a pull request or updating docs, and that's the end of it. With our pgbackups issue, we took this data to our engineering planning meeting and the Postgres team went on to tackle the issues.
First, we introduced better instrumentation around the manual backup process, which led us to replace the mechanism used to upload the backups to durable storage. We also wrapped this mechanism in retry logic, resulting in more stable backup captures. Finally, since we had found broader confusion around the purpose of continuous protection vs. pgbackups, we published new documentation to address the user questions we saw in support tickets. Of course, as with all software, there are more ways we can improve pgbackups and we will continue to do so. These actions together had a measurable impact on the following week’s report:
In addition to the decrease in the questions/guidance and pgbackups categories, we also saw an overall reduction in Postgres-related tickets. This allows the next issue to surface. The notes from this new report indicate we have a lot of follow-up questions around our automated recovery notifications. The team is now exploring improving communication for these events.
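Comparing one report against the next is what makes this kind of change visible. As a rough sketch, assuming each report is reduced to a mapping from culprit to ticket count (the function name and data shape are hypothetical, not our internal tooling), the week-over-week comparison might look like this:

```python
def compare_reports(previous, current):
    """Return (culprit, previous_count, current_count, change) rows.

    Rows are sorted so the biggest decreases come first and newly
    surfaced or growing culprits come last.
    """
    culprits = set(previous) | set(current)
    rows = []
    for culprit in culprits:
        before = previous.get(culprit, 0)
        after = current.get(culprit, 0)
        rows.append((culprit, before, after, after - before))
    # Sort by change, then by name so the order is deterministic.
    return sorted(rows, key=lambda row: (row[3], row[0]))
```

Raw counts are only comparable when both reports sample a similar number of tickets; otherwise it is safer to compare each culprit's share of its own sample.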
As we rinse and repeat, existing features become more and more solid and newly introduced features are quickly polished.
Any ticketing system holds a wealth of knowledge about customer pain points with a product. It's important to look at this data through an appropriate lens. If we look at it too narrowly, it’s difficult to spot trends. If we look at it too broadly, the data loses a lot of its meaning. Striking the right balance allows us to bring actionable data to our product teams that they can use to improve a product and enhance the user experience. Supplying our product and engineering teams with the data they need to make good decisions is one of the primary goals of our support team. We’ll explore more about how our support team works behind the scenes in future blog posts.
Editor’s note: Support is available for all Heroku users through our Help app. For critical production apps and enterprises, our premium support offers guaranteed response times and 1:1 help in running your application.