Sunday, September 27, 2015

Relational Database scalability options in cloud

   Web-scale infrastructure gives us the edge of scaling the infrastructure (private or public cloud) and auto scaling of cloud instance shows us the new era of application scalability.  I have mentioned AWS auto scaling in my previous post and few best practice and application architecture requirement for horizontal scaling. Its little bit fuzzy when it come database horizontal scaling.

   Today we have multiple Database as a service options in different private and public cloud options like openstack trove and AWS RDS. Bitter truth is underneath is,  DBaaS is a cloud instance and it also does not solve the scalability issue.

   In order to simplify the description i will provide database scalability options as bellow.
   1) Vertical scaling:
            Scaling of compute and storage resource of master database when database request starts increasing.
             -> advantage: Easy
             -> disadvantage: Its very costly  and everyone will hit the ceiling eventually.

   2) Horizontal scaling (sharding):
          It kind of tricky business when it comes to horizontal scaling and in order to achieve the same. one need to analyze the the scalability requirement based on the application nature.

    Determine the sharding size based on constraints

  • Write IO
  • Read IO
  • Replication limit
  • Tolerance to large nos of systems
  • Budget

   Step 1: analyze the application read and write requirement on each module bases.

Read Replicas
After analysis of above resource requirement we can tell that database is read intensive and we need to horizontally scale by creating more read replica to distribute the read resource load to different database instances

The given solution solves our problem but what about if application is write intensive. can we create a replica to write database?
NO!! Database does not work that way. so now we need to logically segregate different modules to different databases and create their own read replicas.

But now the question is how we can distribute read and write SQL query requests to all these different databases?


Here the application architecture and developer has to work harder to implement the application logic to switch the query to different databases and/or use different database options like MySQL fabric for such cluster management. as described in billow diagram

    Before i conclude the post i have to mention that described architecture option is applicable in both the options (DBaaS and manual database deployment option in cloud instance).

     To make long story short i Horizontal scaling of database is ultimate destination for all the wen-scale applications only the horizontal scaling of DB and application can help us achieve capacity elastic and automated along with availability and resiliency scale.

     I hope the it was a helpful and simplified description. feel free to comment for your any valuable suggestions.