Boost serverless app performance with RDS Proxy and Amazon Aurora

When the RDS Proxy service was launched, I started working together with my colleague, Felipe Mejia, to test the service and see how it could be used in serverless architectures. We started by testing several Lambda functions executing simultaneous write tasks on an Amazon Aurora MySQL database and examining how RDS Proxy would help us in these types of scenarios. After several tests and metrics, I want to share what I have learned.

RDS Proxy is a fully managed AWS service that acts as a proxy layer between the application layer and the database layer. It takes care of:

  • Pooling and sharing database connections.
  • Increasing application availability.
  • Improving data security.

However, when we started working with serverless architectures and relational databases (Aurora MySQL), we encountered some interesting challenges:

  • DB performance and connection management: Every simultaneous connection consumes database computing resources, and the database has to absorb the growth in connections during a traffic peak.
  • Failover time: For critical applications on serverless architectures, availability is always a priority, so we need to balance the cost of the services against the availability they provide.
  • Security: How do we handle the connection string, username, and password for the database across multiple Lambda functions?

Typical Serverless Architecture with RDS Proxy

RDS Proxy

Connection Management

  • Multiplexing: The proxy can reuse a connection after each transaction in your session. This transaction-level reuse is called multiplexing.
  • Borrowing: This happens when RDS Proxy takes a connection from the pool to reuse it. Once the transaction finishes, the connection returns to the pool.
  • Pinning: In some cases, RDS Proxy cannot be sure that it is safe to reuse a connection outside the current session. In these cases, it keeps the session on the same connection until the session ends.
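
The three behaviours above can be pictured with a toy model. This is only a sketch for intuition, not how RDS Proxy is implemented: `ToyProxy`, its methods, and the `pins_session` flag are hypothetical names invented for this example, and connections are plain integers.

```python
class ToyProxy:
    """Toy model of transaction-level connection reuse (not the real RDS Proxy)."""

    def __init__(self, pool_size):
        self.idle = list(range(pool_size))  # connection ids waiting in the pool
        self.pinned = {}                    # session name -> connection id it holds

    def run_transaction(self, session, pins_session=False):
        """Run one transaction for `session` and return the connection id used."""
        if session in self.pinned:
            # Pinned: this session stays on the connection it already holds.
            return self.pinned[session]
        if not self.idle:
            raise RuntimeError("no idle connection in the pool")
        conn = self.idle.pop(0)             # borrowing: take a connection from the pool
        if pins_session:
            # Pinning: the connection is kept by this session until it ends.
            self.pinned[session] = conn
        else:
            # Multiplexing: give it back so any session can reuse it.
            self.idle.insert(0, conn)
        return conn

    def end_session(self, session):
        """Release the connection held by a pinned session, if any."""
        conn = self.pinned.pop(session, None)
        if conn is not None:
            self.idle.append(conn)


proxy = ToyProxy(pool_size=2)
a = proxy.run_transaction("lambda-a")                     # borrowed, then returned
b = proxy.run_transaction("lambda-b")                     # reuses the same connection
c = proxy.run_transaction("lambda-c", pins_session=True)  # pinned to lambda-c
d = proxy.run_transaction("lambda-a")                     # must use the other connection
e = proxy.run_transaction("lambda-c")                     # still on its pinned connection
proxy.end_session("lambda-c")                             # pinned connection goes back
```

In this model, two unpinned sessions share one connection, while a pinned session takes a connection out of circulation until it ends. That is why a lot of pinning reduces the benefit of the pool.
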
Amazon Aurora

Failover time

Failover can happen when there is a problem with the master instance, when you apply an update, or because of connectivity problems. During a failover, RDS Proxy keeps accepting connections at the same endpoint and automatically directs them to the new instance that takes over as the master instance.

During these failovers, clients are not exposed to:

  • Domain Name System (DNS) propagation delays on failover.
  • Local DNS caching.
  • Connection timeouts.
  • Uncertainty about which DB instance is the current writer.
  • Waiting for a query response from a former writer that became unavailable without closing connections.

Security

RDS Proxy supports TLS protocol versions 1.0, 1.1, and 1.2. You can connect to the proxy using a higher version of TLS than you use in the underlying database.

To connect the Lambda function to the database, credentials are handled through the Secrets Manager service, where a secret is configured in the proxy. At the Lambda level, we point the function to the RDS Proxy endpoint.
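
As a rough sketch of that wiring, the snippet below builds MySQL connection arguments from a secret and a proxy endpoint. Several things here are assumptions rather than details from our setup: the function names, the endpoint value, the 5-second timeout, and the idea that the function reads a secret whose JSON has `username`, `password`, and `port` keys. `load_secret_string` uses the boto3 `get_secret_value` call and needs AWS credentials, so it is defined but not called here.

```python
import json


def connection_kwargs(secret_string, proxy_endpoint):
    """Build MySQL client keyword arguments from a Secrets Manager secret.

    Credentials come from the secret; the host is the proxy endpoint,
    not the database endpoint.
    """
    secret = json.loads(secret_string)
    return {
        "host": proxy_endpoint,
        "user": secret["username"],
        "password": secret["password"],
        "port": int(secret.get("port", 3306)),  # 3306 is the MySQL default
        "connect_timeout": 5,                   # assumed value, tune as needed
    }


def load_secret_string(secret_id):
    """Fetch the raw secret JSON with boto3 (imported lazily, needs AWS access)."""
    import boto3

    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_id)["SecretString"]


# Local example with a made-up secret and a made-up proxy endpoint.
example_secret = '{"username": "app_user", "password": "example-only", "port": 3306}'
kwargs = connection_kwargs(example_secret, "my-proxy.proxy-example.us-east-1.rds.amazonaws.com")
```

The resulting dictionary could be passed to a MySQL client such as PyMySQL (`pymysql.connect(**kwargs)`). The point of the design is that no credentials live in the function code or its environment variables.
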

RDS Proxy

Infrastructure deployment: The entire infrastructure was deployed with CDK + TypeScript. This was an interesting challenge, because the TypeScript documentation is not as complete as the documentation for CDK + Python.

Database configuration

RDS Proxy

Aurora Master Database Instance

Aurora Standby Database Instance

Lambda Function

It was developed in Python 3.7. It is a script that, in two cycles, performs recurring write tasks on the database. In turn, it shows when it writes each record to the database.

With all the Lambda functions writing to the database as described, we executed a failover in two scenarios and ran a total of 15 tests on each of them:
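
The original script is not reproduced here, so the following is only a minimal sketch of the kind of loop described, reading "two cycles" as two nested loops. The function name, the `write_test` table, its columns, and the `placeholder` parameter are all hypothetical. It works with any DB-API connection; an in-memory SQLite database stands in for Aurora so the example runs locally, which is why the placeholder is switched from the MySQL-style `%s` to `?`.

```python
import sqlite3  # local stand-in only; the real target would be a MySQL connection


def write_records(conn, cycles=2, records_per_cycle=3, placeholder="%s"):
    """Insert records in nested cycles, printing each write as it happens."""
    sql = "INSERT INTO write_test (cycle, seq) VALUES ({0}, {0})".format(placeholder)
    cur = conn.cursor()
    written = 0
    try:
        for cycle in range(cycles):
            for seq in range(records_per_cycle):
                cur.execute(sql, (cycle, seq))
                conn.commit()  # commit per record so each write is its own transaction
                written += 1
                print("wrote record cycle=%d seq=%d" % (cycle, seq))
    finally:
        cur.close()
    return written


# Local run against SQLite, which uses "?" placeholders instead of "%s".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE write_test (cycle INTEGER, seq INTEGER)")
total = write_records(conn, placeholder="?")
```

Committing after every record keeps each transaction short, which suits transaction-level connection reuse. The printed lines are also what make an interruption visible during a failover.
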

  • Failover without RDS Proxy: in this scenario, the database was unavailable for 10–12 seconds during failover.
  • Failover with RDS Proxy: in this scenario, only 30% of the tests showed 1 second of unavailability. In the remaining 70%, there was no unavailability.

After running the tests in our scenario, we observed a 90% improvement in failover time.
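
One way to read that figure is to compare the worst case with the proxy (1 second) against the 10–12 seconds measured without it. The sketch below does that arithmetic; the function name is ours, and this reading of how the percentage is derived is an assumption.

```python
def reduction_pct(before_s, after_s):
    """Percentage reduction in downtime going from before_s to after_s seconds."""
    return (before_s - after_s) * 100.0 / before_s


worst_with_proxy = 1.0  # seconds, worst case observed with RDS Proxy

low = reduction_pct(10.0, worst_with_proxy)   # against the shortest outage without the proxy
high = reduction_pct(12.0, worst_with_proxy)  # against the longest outage without the proxy
print("reduction between %.1f%% and %.1f%%" % (low, high))
```

Under this reading the reduction lands between roughly 90% and 92%, and it is larger still for the runs that showed no unavailability at all.
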

Lessons learned

  • Whenever AWS releases a new service, I suggest waiting 3–6 months before using it in production environments, since the documentation is usually incomplete, AWS support doesn’t yet know the service well, and you can have a hard time trying to do complex things.
  • If you are going to implement RDS Proxy, take into account the costs and how they can impact the project.
  • For production environments, I definitely recommend using a proxy to improve security, connection management, and failover.
  • During the tests we performed, failover time was reduced by 90%, and in several cases there was no unavailability.