Monday, November 16, 2015

MySQL-Docker operations. - Part 4: Sandboxes, virtual machines, containers.

We're going to explore the choices and the differences between various types of deployments. We will consider four use cases:

  1. [Friendly]: Testing an application on a server where a different version of the same application is already installed (examples: a Python app requiring many libraries, a MySQL server);
  2. [Intrusive]: Testing a potentially intrusive application (anything that changes your general settings in /usr or /etc);
  3. [Conflicting]: Running a service that has lots of conflicting dependencies (an updated database driver compiled with a version of MySQL different from what you have installed);
  4. [Intractable]: Running an intractable service, one of those that require a specific user to run and assume they have full control of the operating system (e.g. PostgreSQL, Oracle).

For each case, we need to determine the impact on our well being. We assume that the user starts with one reasonably powerful server.

The method used will affect our operations in several ways:

  1. Cost: How much would it cost to implement this method?
  2. Time: How much time will be needed to get things done?
  3. Performance: Can we run things as fast as we need?
  4. Ease of use: Can we get things done without reading a lengthy manual or following an unforgiving and complicated procedure?
  5. Isolation: Can we run our server without affecting other servers?
  6. Storage: Can we add or change storage easily?
  7. Scalability: Can we easily repeat the procedure as many times as needed?
  8. Availability: Can we run any service using this method?
  9. Portability: Can we run this service on several operating systems?
  10. Networking: Can we use this method to run operations that require a network?

Running servers on a regular host

The first way to solve our problem is simple. Take an empty server, install the service, run it. Until not long ago, before the advent of cloud computing, this was the only way to run operations: if your server was not enough, you bought a bigger one, or you bought many small ones and got smarter with them. But inevitably, whether we wanted to install a new service or test a new version of a known application, we needed to find money and physical space to get the job done.

Figure 1 : Applications within a server share the operating system and library resources

In this configuration, everything is by the book. We assume that we will use one physical host to run a main service, using the best configuration we can get to achieve the purpose.

The evaluations in the following table are based on my own experience and may differ from what others feel or need.

Requirement   Score   Notes
Cost          -10     You need to own a new server.
Time          8       You need to install it, but it can be easily automated.
Performance   10      Nothing beats bare metal.
Ease of use   8       As easy as the installation procedure makes it.
Isolation     10      Not going to affect services in other machines.
Storage       -10     Changing storage requires physical manipulation.
Scalability   -10     Every new server requires a new purchase.
Availability  10      We can run anything.
Portability   10      We can install the O.S. that we need, and the services on top of it.
Networking    0       We can use, but can't create or simulate networking.
Total         +56 / -30

The negative results should be considered separately from the positive ones. What could be a prohibitive condition for an individual could be merely a nuisance for someone in a stronger position. For example, if you already have access to bare metal servers for the next two years, thanks to an advantageous merger, the cost factor may not affect you much.
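As a small illustration of keeping the two totals apart, here is a minimal Python sketch that sums positive and negative scores separately. The scores are the ones from the bare metal table; the function name `split_totals` is just an illustrative choice.

```python
# Scores copied from the bare metal table.
bare_metal = {
    "Cost": -10, "Time": 8, "Performance": 10, "Ease of use": 8,
    "Isolation": 10, "Storage": -10, "Scalability": -10,
    "Availability": 10, "Portability": 10, "Networking": 0,
}

def split_totals(scores):
    """Return (sum of positive scores, sum of negative scores)."""
    positive = sum(v for v in scores.values() if v > 0)
    negative = sum(v for v in scores.values() if v < 0)
    return positive, negative

positive, negative = split_totals(bare_metal)
print(f"Total +{positive} / {negative}")  # Total +56 / -30
```

Keeping two numbers instead of one net score makes it easier to see whether a method has any potential show-stoppers, regardless of how strong its good points are.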

Usability evaluation for bare metal servers:

  • [Friendly]: Easily used. No problems here.
  • [Intrusive]: Difficult to use. Installing one of those means that you may have trouble installing anything else.
  • [Conflicting]: Extremely difficult. You may end up unable to upgrade a given service unless you also upgrade all the dependencies, and end up upgrading the whole operating system out of desperation.
  • [Intractable]: Extremely difficult. Once you install one of those, you may not be able to use the server for anything else.

Running servers in a sandbox

In this context, by sandbox I mean an application that runs on a server with strict configuration settings that prevent it from misbehaving. One example for this category is MySQL-Sandbox, where one or more MySQL servers are installed in a host, each of them configured in such a way that it does not clash with the others.

Figure 2 : Sandboxes are regular applications that were carefully configured to behave well without disturbing the neighbors.

While MySQL-Sandbox is designed for testing, deploying several production servers on the same host is a common practice. The main reason for it is that commodity servers have become more and more powerful, but the software hasn't caught up to utilise such power to its fullest. In this context, using a single server on such powerful hosts would be a waste, while installing two or three servers would provide for better effectiveness.

This type is similar to running a plain bare metal server. You are running your MySQL server very close to the metal, as there is no software layer between the server and the operating system. Applications configured this way are as fast as the hardware allows. However, they are not as secure. While a lone server running inside its dedicated host does not have to worry about clashing, a sandbox shares libraries and other operating system resources with other similar servers, and a clash is easy to provoke. Mixing up the configuration settings would be enough to make one or more of them stop working or corrupt data. A sandbox could also drain all the resources (e.g. the main memory), leaving all the other contenders out in the cold.
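To make the idea of non-clashing configuration concrete, here is a minimal Python sketch. It is not how MySQL-Sandbox itself works: the base port, the directory layout and the function names are assumptions made for illustration. Only the option names (`port`, `socket`, `datadir`, `pid-file`) are real MySQL server options.

```python
def sandbox_settings(index, base_port=3310, base_dir="/tmp/sandboxes"):
    """Build per-instance settings so that no two sandboxes share a resource.

    base_port and base_dir are hypothetical defaults chosen for this sketch.
    """
    home = f"{base_dir}/instance_{index}"
    return {
        "port": base_port + index,
        "socket": f"{home}/mysql.sock",
        "datadir": f"{home}/data",
        "pid-file": f"{home}/mysql.pid",
    }

def find_clashes(instances):
    """Return the (option, value) pairs used by more than one instance."""
    seen = set()
    clashes = set()
    for settings in instances:
        for item in settings.items():
            if item in seen:
                clashes.add(item)
            seen.add(item)
    return sorted(clashes)

instances = [sandbox_settings(i) for i in range(3)]
print(find_clashes(instances))  # [] : every port, socket and directory is unique

# Mixing up a single setting is enough to create a clash.
instances[2]["port"] = instances[0]["port"]
print(find_clashes(instances))  # [('port', 3310)]
```

A check like this only catches configuration mix-ups. It does nothing about the other risk, a sandbox draining shared resources such as memory.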

Requirement   Score   Notes
Cost          10      No investment required.
Time          8       As easy as the installation procedure makes it.
Performance   10      Still bare metal, even if there is potential concurrency.
Ease of use   10      As easy as the manual says it is.
Isolation     -5      Depends on the service configuration. Although functionally independent, the services can clash.
Storage       5       Sandboxes can be resized at will (within the limits of existing storage).
Scalability   10      Deployment of new instances is only limited by the host resources.
Availability  5       We can run only applications that are fully configurable.
Portability   -5      We can only run applications for the host O.S.
Networking    0       We can use, but can't create or simulate networking.
Total         +58 / -10

There are several advantages to using sandboxes instead of a dedicated host, such as being able to deploy multiple servers without buying new hardware or installing virtual machines. There are, however, obvious limitations, like the lack of isolation mentioned above and the fact that only applications compiled for the host operating system can run in this fashion.

Usability evaluation for sandboxes:

  • [Friendly]: Easily used. This is the strong point of sandboxed applications.
  • [Intrusive]: Difficult to use. Sometimes impossible.
  • [Conflicting]: Difficult but possible to use. It's one of the cases where having a conflicting application used in a parallel environment could be beneficial.
  • [Intractable]: Almost impossible to reduce to a sandboxed environment.

Running servers in virtual machines

Figure 3 : A virtual machine isolates the application and the operating system.

Virtual machines are the heart of current cloud computing strategies. The ability to create servers that behave almost like bare metal ones, without the need to physically buy them and transport them into a data center, has changed the economics of most companies in the past decade.

Requirement   Score   Notes
Cost          -5      Moderate investment required.
Time          0       As easy as the installation procedure makes it, but the O.S. must be installed as well.
Performance   -10     There is much overhead from the additional layers and the need for a full O.S.
Ease of use   8       Everything that is allowed through the interface.
Isolation     9       It can be as good as a physical host. There is still the risk of one VM negatively affecting others.
Storage       5       V.M.s can be resized at will (within the limits of existing storage).
Scalability   10      Deployment of new instances is only limited by the host resources.
Availability  10      We can run anything.
Portability   10      We can install the O.S. that we need, and the services on top of it.
Networking    10      We can use and create networks.
Total         +62 / -15

Compared to bare metal, virtual machines can scale at will. In a few minutes you can deploy a new VM of the size needed for your current business, and get rid of it when the need ends. Unlike sandboxes, virtual machines let you run any operating system and any application. In addition, you can have a network for public and private communication between servers.

There are prices to pay. First of all, virtual machines will cost you. Depending on the usage, they could be much cheaper than buying and housing your own physical servers, but they won't be free. Sure, you can install a virtual machine on your initial server, the same way that you can do it for a sandbox, but then you run into the second great limitation: performance. Even with the best software available today, the performance of a server running in a VM is greatly inferior to that of a server on bare metal.

You can compensate for performance by splitting the job into many parts and deploying many small virtual machines that will work in parallel. When a solution like this is successfully deployed, the performance of the group of virtual machines can surpass that of a single bare metal server. Unfortunately, to achieve this goal, you would incur more costs than buying a single server, and your application will need to be adapted to working in a distributed environment. This solution can work, and it has been deployed successfully in many cases, but it is not one-size-fits-all, and if done with poor planning it can backfire.

Usability evaluation for virtual machines:

  • [Friendly]: Easily used. No problems here.
  • [Intrusive]: Easily used with overhead. Just install another virtual machine.
  • [Conflicting]: Easily used with overhead.
  • [Intractable]: Easily used with overhead.

Running servers in containers

Figure 4 : Docker containers are thin layers of libraries and applications on top of a common kernel.

Containers are a growing trend in the virtualization ecosystem. If, by the previous statement, you believe that containers are virtual machines, you need to reconsider immediately, or risk failing to understand this technology. Containers are not virtual machines, although they have many things in common. Like virtual machines, containers are entities distinct from the host computer: they can be deployed as a package and started, and the service inside them can be used more or less like a server on bare metal.

The differences between virtual machines and containers are few but very important:

  • A container does not pack a full operating system, but just a thin layer of the libraries needed to run the service in it.
  • The service itself is often a stripped-down version of the original application.
  • Most important, the software in the container uses the host kernel directly, without any intermediate layer.
  • For the above reasons, while a virtual machine starts up in minutes, a container starts up in less than a second.

A container is a well packaged application that can be downloaded very quickly, and once downloaded can be instantiated several times with incredible speed.
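To give a feel for what instantiating the same image several times looks like, here is a small Python sketch that only builds and prints `docker run` command lines; it does not execute them. The image name `mysql` (the official image on Docker Hub), the container names and the password are assumptions chosen for illustration. `MYSQL_ROOT_PASSWORD` is the environment variable read by the official image.

```python
import shlex

def docker_run_command(name, image="mysql", root_password="secret"):
    """Build (but do not execute) a `docker run` command for one container."""
    return [
        "docker", "run",
        "--name", name,                                # container name (hypothetical)
        "-e", f"MYSQL_ROOT_PASSWORD={root_password}",  # placeholder password
        "-d",                                          # run detached, in the background
        image,
    ]

# Three containers from the same image: only the name changes.
for i in range(1, 4):
    print(shlex.join(docker_run_command(f"mysql-node{i}")))
```

The image is downloaded once. Every further container is a new instance of what is already on the host, which is why starting the second and third one is so quick.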

Another notable difference between containers and virtual machines is that containers are less isolated, because they use the same kernel as the host, rather than a virtualized one. On the one hand, this makes containers less secure; on the other hand, they are blazingly fast.

Figure 5 : Docker containers can share libraries and other image layers

There is another reason for the speed and low storage footprint of containers. Docker containers are deployed in layers. Some of those layers are used by a single container, while others can be shared between two or more containers. While a virtual machine is an enormous blob that can reach several GB, a container could be a thin modification of an existing image, and thus can be downloaded in seconds and deployed even faster.
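The effect of shared layers on download size can be sketched in a few lines of Python. This is a toy model, not Docker's actual storage logic: the layer names, image names and sizes are invented for illustration and were not measured from real images.

```python
# Hypothetical layer sizes in MB; not measured from real images.
LAYER_SIZES = {"base-os": 120, "mysql-libs": 60, "mysql-server": 180, "custom-config": 1}

# Two hypothetical images that have their first three layers in common.
IMAGES = {
    "mysql-plain": ["base-os", "mysql-libs", "mysql-server"],
    "mysql-tuned": ["base-os", "mysql-libs", "mysql-server", "custom-config"],
}

def download_size(image, already_present):
    """Size in MB of the layers that still have to be fetched for `image`."""
    return sum(LAYER_SIZES[layer] for layer in IMAGES[image]
               if layer not in already_present)

local_layers = set()
for image in ("mysql-plain", "mysql-tuned"):
    print(f"{image}: download {download_size(image, local_layers)} MB")
    local_layers.update(IMAGES[image])
# mysql-plain: download 360 MB
# mysql-tuned: download 1 MB
```

In this model the second image costs almost nothing, because only the layer that differs has to be fetched and stored.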

Requirement   Score   Notes
Cost          10      No investment required.
Time          10      Fast, fast, fast!
Performance   9       Almost as fast as running on bare metal. Tiny overhead.
Ease of use   3       Requires some learning and new workflows.
Isolation     7       Much better than a sandbox. Less than a V.M., because containers use the same kernel.
Storage       5       Containers can be resized at will (within the limits of existing storage).
Scalability   10      Deployment of new instances is only limited by the host resources.
Availability  3       We can run only applications that have been adapted for containers.
Portability   -5      We can only run applications for the host O.S.
Networking    10      We can create and use networks.
Total         +67 / -5

What are the strong points of containers? Low cost (or no cost, if all you need is what fits in your current server), good performance, private networking, easy to scale.

The limitations, as of today, are portability (applications can only run in the same OS as the host) and the ease of use. This is a point that is going to change. Using containers requires some changes in the applications (or finding ready made images) and an understanding of the environment, which could be intimidating for people used to the old ways. But once you get past the initial learning phase, everything feels very easy, and eventually the usage will be far easier than the old ways.

Usability evaluation for containers:

  • [Friendly]: Easily used. No problems here.
  • [Intrusive]: Easily used, with little or no overhead.
  • [Conflicting]: Easily used, with little or no overhead.
  • [Intractable]: Difficult to use, sometimes impossible if the intractable application or service was built without flexibility in mind.

All solutions comparison

For convenience, I made a table with a comparison of the solutions examined above.

I must stress that these evaluations are my own, very much subjective, and based on my experience. They may differ from those of others, and possibly also from my own in a few years or months. Talking about Docker is like catching eels: it's a moving target where the technology evolves and improves daily. This fluidity is possibly the most appealing characteristic of Docker and container-related technology: its evolution has been and continues to be fast and effective, addressing the users' needs at incredible speed.

Requirement   Bare metal   Sandbox      Virtual machine   Container
Cost          -10          10           -5                10
Time          8            8            0                 10
Performance   10           10           -10               9
Ease of use   8            10           8                 3
Isolation     10           -5           9                 7
Storage       -10          5            5                 5
Scalability   -10          10           10                10
Availability  10           5            10                3
Portability   10           -5           10                -5
Networking    0            0            10                10
Total         +56 / -30    +58 / -10    +62 / -15         +67 / -5

I believe we haven't seen the end of this trend yet. What we have seen so far with containers and virtual machines seems to aim at an architecture built on micro services. Containers could take a substantial role in the transition towards that reality.

What can we take away from this analysis?

  • Bare metal servers are not outdated yet. There are still cases where they are irreplaceable. Despite the cost associated with their usage, they are not extinct yet, though only just.
  • Virtual machines are still in charge of the scalability department in many cases. However, they feel the advance of containers and need to either evolve or merge into a more flexible architecture to deal with increasing demands from users.
  • Containers are the new force in IT. They can play well with both bare metal servers and virtual machines, while waiting for the rise of container-oriented operating systems, which already exist and aim at world domination in the not too distant future.

I see a future where the rise of containers and micro systems will force software makers to simplify their products and make them more modular and easy to play with. This trend is important in the current cloud architecture and will become vital when containers take over.

In the meantime, I am not giving up MySQL-Sandbox, which is still indispensable in most scenarios, but I am starting to rethink the architecture to fit smarter future uses.

MySQL deployment summary

With all the above considerations, where do we stand with MySQL? My view is that we're still in the middle ground. MySQL is still used heavily on bare metal, either as a stand-alone server or as part of multi-server deployments on the same host.

It is also massively employed in the cloud, where it offers many advantages for deployment flexibility and ease of scalability. Yet it still lacks the agility necessary to be a native cloud component. There are several attempts at creating a better cloud player out of MySQL, some successful, some less so.

When it comes to containers, MySQL still has much work to do to become an efficient building block in the new, ebullient architecture expansion. The MySQL team provides an official package, which is a first step towards becoming a good player. But in the near future there will be demands for more integration and better modularity than what's available today. Looking at the internals of MySQL deployment in a container shows that the system is struggling to adapt to the new medium. I see the container revolution as an opportunity for established applications like MySQL to improve their usability and increase their ability to play well with other components of the emerging IT infrastructure.

What's next

In the next (and last) episode we will see MySQL, Docker and orchestration tools playing together to deliver faster and more powerful operations.
