Show HN: OtterTune – Automated Database Tuning Service for RDS MySQL/Postgres
164 points by apavlo on Oct 14, 2021 | 63 comments
Yo. OtterTune is a database optimization service. It uses machine learning to automatically tune your MySQL and Postgres configuration (i.e., RDS parameter groups) to improve performance and reduce costs. It does this by only looking at your database's runtime metrics (e.g., INNODB_METRICS, pg_stat_database, CloudWatch). We don't need to examine sensitive queries or user tables. We spun this project out of my research group at Carnegie Mellon University in 2020.

This week we've announced that OtterTune is now available to the public. We are offering everyone a starter account to try it out on their Postgres RDS or MySQL RDS databases (all versions, AWS US AZs only). We have seen OtterTune achieve 2-4x performance improvements and 50% cost reductions for these databases compared to using Amazon's default RDS configuration.

I am happy to answer any questions that you may have about how OtterTune works here.

-- Andy

================

More Info:

* 5min Demo Video: https://ottertune.com/blog/ottertune-explained-in-five-minutes

* Free Account Sign-up: https://ottertune.com/try




Hi! This looks really cool and I like that you've published/presented at VLDB.

I'm a little curious why AWS is free/cheap, but anything other than AWS ends up in the "Enterprise, contact us for pricing" bucket. It might simply be based on your costs. I don't know if the on-prem stuff is expensive since your software needs to be in the same datacenter or if it's just a pricing differentiator.

I'm also wondering if you've thought of going the DBaaS route. It looks like AWS is charging an ~80% premium for RDS over regular on-demand instances. If you're managing a fleet of them and getting lower pricing (through committed-use discounts and negotiation) you could potentially get good margin there. For example, AWS charges $280/mo (on-demand) for an M5.2xLarge and $500/mo (on-demand) for the same instance for MySQL RDS (single AZ). So Amazon is already asking for an additional $220 and then you'd like $90/database ($450 / 5 databases on the standard plan). It seems like there might be more opportunity in the DBaaS markup than in trying to convince someone to use your tuning engine.
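For reference, the arithmetic in the paragraph above works out as below. This is only a sketch using the prices quoted there; they are the commenter's figures and have not been checked against current AWS or OtterTune pricing.

```python
# Figures quoted in the comment above (on-demand, single AZ, per month).
EC2_M5_2XLARGE = 280.0   # plain EC2 instance
RDS_M5_2XLARGE = 500.0   # same instance class as MySQL RDS
STANDARD_PLAN = 450.0    # OtterTune standard plan
DATABASES_IN_PLAN = 5

rds_markup = RDS_M5_2XLARGE - EC2_M5_2XLARGE        # dollars per month
rds_premium = rds_markup / EC2_M5_2XLARGE           # fraction over EC2
tuning_per_db = STANDARD_PLAN / DATABASES_IN_PLAN   # dollars per database

print(f"RDS markup: ${rds_markup:.0f}/mo ({rds_premium:.0%} over EC2)")
print(f"Tuning cost: ${tuning_per_db:.0f}/database/mo")
```

So the "~80% premium" is $220 on top of $280, and the tuning service adds $90 per database on the standard plan.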

From your video, it doesn't sound like OtterTune does this, but it might be worth thinking about: maybe not just tuning the database, but also offering suggestions on potential anti-patterns happening. For example, if someone has code that should be using connection pools and isn't, it could be helpful to present it as something potentially important. Likewise, let's say that there are unindexed queries. Detecting that and recommending solutions is helpful. That seems like it could offer ongoing benefit.

Speaking of ongoing benefit, would there be a lot of benefit to paying for more than one month? In the 5 minute video, it showed the configuration stabilizing pretty quickly once it reached a good place. Assuming that my access patterns remain relatively the same, it seems like I could just continue with that configuration. Maybe I could sign up for one month every year to get a check-up. This isn't meant to be a criticism. It's more that I've seen so many customers exploit holes in SaaS pricing. If you're counting on recurring revenue, what happens if a customer notices that after a day they don't need the knobs tuned anymore? Can I get that $450 5-DB plan and run it on 5 databases on Monday, remove it from those 5 databases on Tuesday and apply it to 5 new databases on Tuesday (effectively getting 150 databases per month for the price of 5)? Or maybe there is a need to be tuning the configuration hourly/daily (though that seems less likely).

I do like the idea of auto-tuned knobs and it seems like a great service. I guess it's just hard to try if, like me, you're not on AWS. I'm sure that most people are so I'm not suggesting it's an inappropriate restriction for the free tier. Really cool and congrats on VLDB!


Lots to unpack here.

> I'm a little curious why AWS is free/cheap, but anything other than AWS ends up in the "Enterprise, contact us for pricing" bucket. It might simply be based on your costs. I don't know if the on-prem stuff is expensive since your software needs to be in the same datacenter or if it's just a pricing differentiator.

The expensive part for us about on-prem is the amount of custom code that we have to write for the agent to deal with how the organization maintains their config files. For example, one customer that we talked to maintained Terraform files for their MySQL confs in GitHub. So we had to modify the GitHub repo and push to main, which would then trigger a GitHub Action that pushed the update to the database. Then we had to know when the updated config was deployed so that we could bounce the system (if it required a restart). Restarting the DBMS is tricky too, since everybody does it differently.

> I'm also wondering if you've thought of going the DBaaS route.

There is a lot to running a DBaaS (backups, configs, upgrades). At this point I think it would be very difficult to compete with Amazon/MSFT/Google/Oracle/etc if it was just stock Postgres/MySQL. You need a killer feature.

> Speaking of ongoing benefit, would there be a lot of benefit to paying for more than one month?

You are asking about a floating license. This is a potential for enterprise customers. For now we track RDS databases by their ARN. So if you try to switch that on OtterTune, it treats it as a new database. This is necessary to make sure that the ML models don't freak out if there is a dramatic change all of a sudden.

The demo video was with TPC-C. The workload is stable. We've seen workloads that evolve a lot over the course of months, so it is unlikely that the same configuration would be optimal during this entire period.

> I guess it's just hard to try if, like me, you're not on AWS.

Reach out to us and I'm happy to talk to you about your setup: https://ottertune.com/contact

> Really cool and congrats on VLDB!

The science comes first.


free/cheaper network egress traffic and you will beat RDS.

See the RDS costing is not fully transparent. you pay CRAZY LOT for backups - backup to s3 is impossible, except in a weird parquet way. This is a cost lock-in.

you again pay CRAZY LOT for network egress. so i cant use a Vercel app with RDS. ill just spend a lot on network.

these two features and anybody will switch: 1. cheap network traffic 2. cheap backups.

charge me 20% above RDS (for the base machine) and ill pay happily.


> I'm also wondering if you've thought of going the DBaaS route.

A different take: instead of coming up with one more DBaaS, they can just focus on making the existing DBs better and smarter, which could potentially bring a lot of the customers who moved to the cloud back to on-prem.


> I'm a little curious why AWS is free/cheap, but anything other than AWS ends up in the "Enterprise, contact us for pricing" bucket.

Wild guess, but I'm assuming only AWS support is implemented right now but they want people to contact them to gauge interest.


I know this Show HN is about RDS specifically, but the site suggests this works elsewhere, too; any issues with pointing OtterTune at DBs making heavy use of extensions (e.g. Timescale, Citus) or nonstandard deployment approaches (k8s, patroni, vitess)?


We have deployed OtterTune for on-prem databases (baremetal, containers). But we are not offering that to everyone right now because organizations manage their configurations for these deployments in a bunch of different ways (Github actions, Chef/Puppet, Terraform). The nice thing about RDS is that they have a single API for updating configurations (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_...).
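To illustrate why a single API helps: with RDS, pushing a set of knob values comes down to one kind of call against the parameter group. The sketch below is not OtterTune's agent code. It assumes a boto3-style RDS client is passed in, the helper names are made up for illustration, and the 20-parameters-per-request batch size reflects my understanding of the RDS limit.

```python
from itertools import islice

MAX_PARAMS_PER_CALL = 20  # assumption: per-request limit on modified parameters


def to_rds_parameters(knobs, apply_method="pending-reboot"):
    """Convert {name: value} into the Parameters list the RDS API expects."""
    return [
        {
            "ParameterName": name,
            "ParameterValue": str(value),
            "ApplyMethod": apply_method,
        }
        for name, value in knobs.items()
    ]


def apply_knobs(rds_client, group_name, knobs, apply_method="pending-reboot"):
    """Push knobs to a parameter group in batches; return the number of calls."""
    params = iter(to_rds_parameters(knobs, apply_method))
    calls = 0
    while batch := list(islice(params, MAX_PARAMS_PER_CALL)):
        rds_client.modify_db_parameter_group(
            DBParameterGroupName=group_name, Parameters=batch
        )
        calls += 1
    return calls
```

With boto3 installed, usage would look like `apply_knobs(boto3.client("rds"), "my-param-group", {"work_mem": 8192})`, where the group name is a placeholder. Dynamic parameters can use `apply_method="immediate"`; static ones only take effect after a reboot.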

As for the Postgres derivatives that you mentioned (Timescale, Citus), we have not tested OtterTune for them yet. But there is no reason they shouldn't also work because they expose the same metrics API (pg_stat_database) and config knobs. There are just way more people using Postgres (RDS, Aurora), so we are focusing on them right now.
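As a rough picture of what "same metrics API and config knobs" means in practice, here is a minimal sketch that reads a few counters from pg_stat_database and the current settings from pg_settings. It assumes any DB-API cursor connected to Postgres (for example from psycopg2). The function name and the choice of columns are mine for illustration, not what the OtterTune agent actually collects.

```python
METRICS_SQL = """
SELECT datname, xact_commit, xact_rollback, blks_read, blks_hit
FROM pg_stat_database
WHERE datname = current_database()
"""

KNOBS_SQL = "SELECT name, setting FROM pg_settings"


def collect_snapshot(cursor):
    """Return (metrics, knobs) using a DB-API cursor connected to Postgres."""
    cursor.execute(METRICS_SQL)
    columns = [col[0] for col in cursor.description]
    row = cursor.fetchone()
    metrics = dict(zip(columns, row)) if row else {}

    cursor.execute(KNOBS_SQL)
    knobs = {name: setting for name, setting in cursor.fetchall()}
    return metrics, knobs
```

Neither query touches user tables or query text, which is the point made in the original post.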


Hi Andy.

First of all, congrats for the project.

As you probably know, you and any other user can indeed manage Postgres configurations, in any environment, via an API. See the "MANAGE" section in https://postgresqlco.nf/ Essentially, a hosted "Git repository" for Postgres configurations.


Alvaro! Good to hear from you. We really like the Ongres conf management tool. We just haven't come across anybody using it for on-prem DBs in the wild.

As I said above, everyone that we've talked to rolls their own conf management tools because they are doing more than just DBMS confs (proxies, networking, kernel params, middleware, etc). Lots of Terraform.


Glad to hear that.

Yes, indeed we have considered doing a TF module for the configuration management API :)


What about OpenStack Trove DBaaS? OpenStack Trove is like an open source self-hosted Amazon RDS or Google CloudSQL. https://docs.openstack.org/trove/latest/

FWIU, Trove supports 10+ databases including MySQL and PostgreSQL.


AFAIU, there are sound reasons to host containers with e.g. OpenStack VMs instead of a k8s scheduler with a proper SAN and just figure out how to redundantly and resiliently sync - replicate, synchronize, primary/secondary, nodeprocd - and tune given the CAP theorem and the given DB implementation(s)?

Here's the official Ansible role for Trove, which provisions various e.g. SQL databases on an OpenStack cloud: https://github.com/openstack/openstack-ansible-os_trove


Makes sense! thanks for the reply Andy.


Here's a recent 50-minute talk on OtterTune given by Andy Pavlo:

https://m.youtube.com/watch?v=4d95fnGrSy8

I remember it contains two use cases other than AWS RDS.


Do you also plan to give optimization recommendations for the DB schema itself?

I'm thinking more specifically about indexes.

In my organization, we provide a solution where clients can run complex analytic queries periodically. We frequently have to tune the DB manually by adding an index here or there (or just brute force by switching to a bigger RDS...).

A product analyzing the query logs, combined with the estimated size of the tables plus some EXPLAINs run, and then using that to provide index/schema recommendations could help a lot.


I've been looking out for OtterTune for several years now, ever since the research paper was published. Congratulations on reaching live product stage!

edit: The original research paper: https://cs.cmu.edu/~ggordon/van-aken-etal-parameters.pdf


Forgot the band link: https://ottertune.bandcamp.com/music


Also, Andy Pavlo is my #1 celebrity crush. For people who are not aware, his DB lectures are great: https://www.youtube.com/c/CMUDatabaseGroup/featured


Andy keeps it real


OtterTune removed the AWS availability zone limitation in the free tier today. You can use the free version to tune RDS PostgreSQL or MySQL in any availability zone now.


What are the top three Postgres configurations Otter changes? e.g. parameter x from y to z


It depends on the workload (OLTP vs. OLAP, read-heavy vs. write-heavy). In our original research paper (https://db.cs.cmu.edu/papers/2017/p1009-van-aken.pdf), we came up with an automated way of determining which knobs have the most impact on performance. We've tested up to Postgres v13 in the commercial version of OtterTune.

For most workloads, shared_buffers and work_mem have the most impact. For write-heavy OLTP workloads, tuning the WAL (max_wal_size) and autovacuum knobs (e.g., autovacuum_vacuum_scale_factor) has the most benefit. For read-mostly OLAP workloads, Postgres' parallel knobs (max_parallel_workers_per_gather) provide the most improvement.

But you need to also tune all the other knobs to get the last 15-40% of potential performance improvement. This is what OtterTune can do.
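The breakdown above can be written down as a small lookup, which may be handy as a checklist for where to look first. This is just the comment restated as data. The workload labels and function name are hypothetical, and it says nothing about what values to set.

```python
# Knobs named in the comment above, grouped the way it groups them.
COMMON_KNOBS = ["shared_buffers", "work_mem"]
WORKLOAD_KNOBS = {
    "write-heavy-oltp": ["max_wal_size", "autovacuum_vacuum_scale_factor"],
    "read-mostly-olap": ["max_parallel_workers_per_gather"],
}


def first_knobs_to_check(workload):
    """Return the shortlist for a workload label (labels are hypothetical)."""
    return COMMON_KNOBS + WORKLOAD_KNOBS.get(workload, [])


print(first_knobs_to_check("write-heavy-oltp"))
```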

More info about how we select knobs: https://ottertune.com/blog/prevent-machine-learning-from-wre...


Heads up, was reading the docs about how it works.

>The OtterTune Agent connects to the database to retrieve the necessary information. OtterTune only requires access to retrieve the database's runtime metrics and configuration information (knobs). It does need to access user tables or queries.

Presumably, the last line should say OtterTune does NOT need to access tables/queries?


Good catch. Fixed!


Details about our new self-service offering using Fargate are available here: https://ottertune.com/blog/ottertune-2021-10-product-update


OtterTune looks very cool.

I see you can setup using CloudFormation. Any chance terraform is on the roadmap?

On a personal note, thanks for everything you do. I’m a subscriber to the CMU Database Group channel and advocate of keeping it real.


> I see you can setup using CloudFormation. Any chance terraform is on the roadmap?

Yes, we will be adding support for Terraform by next month.

> On a personal note, thanks for everything you do.

You're welcome! But database research is probably keeping me out of jail. That's why I have to keep going.

> I’m a subscriber to the CMU Database Group channel and advocate of keeping it real.

Don't thank me. Thank the Steven Moy Foundation for Keeping it Real (https://stevenmoyfoundation.org).


Hi Andy,

If I may, can you please shed light on why Peloton had to be archived and in essence re-done with OtterTune. Interested in your team's learnings from it from a software engineering point of view.

Some additional Qs:

- How did the team ensure this project doesn't suffer from the same disadvantages as its predecessor?

- What would you advise other teams undertaking a rewrite to pay off their tech debts?

How does this project compare to / contrast with Google's and SingleStore's efforts in this space?

Any chance we see you do a Peter Bailis and Sisu Data this? (:

Thanks.


> If I may, can you please shed light on why Peloton had to be archived and in essence re-done with OtterTune. Interested in your team's learnings from it from a software engineering point of view.

Peloton and OtterTune are completely different projects. Peloton was abandoned and rewritten as NoisePage (https://noise.page). OtterTune has always been OtterTune.

See this recent interview where I discuss why we gave up on Peloton:

https://www.ibm.com/cloud/blog/database-deep-dives-with-andy...

> - How did the team ensure this project doesn't suffer from the same disadvantages as its predecessor?

Again, different projects. OtterTune is all about not having to modify the internals of Postgres, MySQL, and any other DBMS. This is why we were able to support Oracle in the academic version in a short amount of time:

https://ottertune.com/blog/vldb-autonomous-database-tuning-i...

> - What would you advise other teams undertaking a rewrite to pay off their tech debts?

It is hard for me to provide general advice for this question because every situation is different.

> How does this project compare to / contrast with Google's and SingleStore's efforts in this space?

I am not familiar with Google or SingleStore using ML in the manner that we are with OtterTune to tune configuration knobs. Or at least I have not seen anything public about it.

These days Oracle is the most aggressive with pushing automated tuning capabilities (Oracle's autonomous DBaaS, AutoPilot for MySQL Heatwave). The difference between these approaches and OtterTune is that right now we are focused on configuration tuning (to avoid data privacy issues) and our core approach is platform/DBMS agnostic.

> Any chance we see you do a Peter Bailis and Sisu Data this? (:

I don't know what you mean by this? Peter Bailis is the Ryan Gosling of databases.


Hi Prof. Andy,

I understand it's complex for on-prem. Instead of you tweaking the product to be compatible with every other company, why not let the companies be compatible with your standard? I understand it's easier said than done, especially in the early innings.

PS: I really want to try OtterTune but I don't have AWS RDS. I have on-prem (aka Raspberry Pi) and I will change everything the way you want to have OtterTune run on my Raspberry Pi.


I don't get the user account limit in products like this. On the free tier, sure.. limit me to 1 account. Fine.

But on a $450 plan a 1 user limit is just absurd. So what? me and my other fellow senior dev can't BOTH access your service for this amount of money when clearly the real limit you are paying for is the amount of databases you can monitor.


We overlooked this. You are right about the number of users being artificially low for our standard plan and we will adjust. More databases is the more expensive part on our side. Thanks for the feedback.


Hey! This is really cool! We did the same but for Presto and Spark by following a similar approach (mostly focusing on Partitioning and Bucketing, but the intermediate metadata datasets ended up being leveraged by multiple teams in the company). Looking forward to the VLDB presentation.


Looks like only AWS is supported now. Will there be GCP Cloud SQL support? Seems very useful!


We have had very few requests to support GCP. After AWS and on-prem, the next most requested platform is Azure.

If you really want GCP, let us know here: https://ottertune.com/contact


Once a database has been tuned, why would people keep paying for the service?


It's true that for static workloads/databases, you don't need continuous tuning.

But as you add new app features, grow database size, upgrade DBMS versions, and make other changes, OtterTune tracks your system to make sure you are always using the best config. This problem is even harder with 1000s of individual database instances. Customers have told us that they want something that can tune databases continuously in a way that is not possible with DBAs and existing monitoring tools.

Knob tuning is also the first step in the kind of automation that we are working on using OtterTune's ML-based approach.


If we leave just monitoring turned on, would we still see recommendations from the tuning engine, or do we need to enable auto-tuning to start seeing those inputs from OtterTune?


Our expected usage for the current version of OtterTune is that you will enable tuning and then let the ML algorithms figure out your workload patterns. The models will then converge on an optimal config and then you switch it back into monitoring mode. How long this takes depends on your workload patterns.

We are working on the ability for OtterTune to alert you during monitoring mode when it thinks your workload/database have changed enough that it merits turning the tuning mode back on. We could also have it show the recommendations when this occurs as you suggest.


Does it automatically apply its recommendations? Or does it just propose them?

My DB configuration is about the last thing I want changing while I sleep.


Right now it automatically applies them. Reach out to us to tell us how you want the product to work with your setup: https://ottertune.com/contact


OtterTune allows you to set tuning schedules, so you can ensure OtterTune doesn't tune while you sleep.
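For anyone wondering what such a window check might look like, here is a minimal sketch. It is not OtterTune's implementation. The function name is made up, and it assumes a schedule is a daily start/end time that may wrap past midnight.

```python
from datetime import time


def tuning_allowed(now, start, end):
    """True if `now` falls in [start, end); handles windows that wrap past midnight.

    If start == end the window is treated as empty.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


# Allow config changes only during working hours, never overnight.
print(tuning_allowed(time(10, 0), time(9, 0), time(17, 0)))
print(tuning_allowed(time(3, 0), time(9, 0), time(17, 0)))
```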


Do you plan to support self-hosted Postgres?


Yes. It's on our roadmap for next year.


Hey, best of luck Andy. I've been following your db classes online and the launch of OtterTune on twitter.


Thanks! Word is bond.


Why is RDS Aurora in the enterprise plan?

Isn’t the main use case for it for savings? In other words less expensive setups.


We are still measuring how much we are able to optimize Aurora with customers in private beta deployments. Aurora is trickier than RDS because Amazon removed a bunch of config knobs when they rewrote the storage layer. So reducing Aurora costs requires more aggressive tuning strategies (e.g., autovacuum) than RDS, which is why we don't want to just let anybody use it right now. It requires more time on our part to make sure that things go smoothly.

Optimizing MySQL/Postgres RDS in the free version is low risk.


I understand this does not support MariaDB, right?


Correct. We could add support but most of our requests have been for MySQL and Postgres. Hit us up if you really want MariaDB: https://ottertune.com/contact


Wondering why they decided not to give free trials to EU Zones. GDPR?


We have done two major deployments of OtterTune in Europe. One of them we published a VLDB paper about (https://ottertune.com/blog/vldb-autonomous-database-tuning-i...). Their infosec people looked at the data OtterTune collects and determined that there are no GDPR issues.

We are incrementally expanding the AZs we support in the free tier in order to carefully scale our service. We plan to support EU zones by the end of the year.


Any way to sign up to get notified when it's in ap-southeast-2?


Reach out to us and we'll make sure you get notified: https://ottertune.com/contact


Done! Thanks.


Great, looking forward to trying this when it's in eu-west-2 (London)


Why is it called OtterTune?


It's a play on the word "autotune".

Otters are also vicious animals, which is an important trait to have if you are working with databases: https://youtu.be/J7f6s2g8C0I


Linguistics nerd answer: sounds like "autotune" if you have the caught-cot vowel merger and alveolar flapping:

https://en.m.wikipedia.org/wiki/Cot%E2%80%93caught_merger https://en.m.wikipedia.org/wiki/Flapping


Tried and true pattern (see: DBeaver)


any plans for MSSQL?


MSSQL is the #1 requested DBMS for us after Postgres and MySQL (RDS and Aurora). Reach out to us about what you need and we can let you know if this is something we could add soon: https://ottertune.com/contact


It's hard for me to take this seriously when it starts with 'Yo.'



