
Practical Linux Infrastructure

Practical Linux Infrastructure teaches you how to use the best open source tools to build a new Linux infrastructure, or alter an existing infrastructure, to ensure it stands up to enterprise-level needs. Each chapter covers a key area of implementation, with clear examples and step-by-step instructions. Using this book, you'll understand why scale matters and what considerations you need to make. You'll see how to switch to Google Cloud Platform for your hosted solution, how to use KVM for your virtualization, how to use Git, Postfix, and MySQL for your version control, email, and database, and how to use Puppet for your configuration management. For enterprise-level fault tolerance you'll use Apache; for load balancing and high availability, you'll use HAProxy and Keepalived. For trend analysis you'll learn how to use Cacti, and for notification you'll use Nagios. You'll also learn how to utilize BIND to implement DNS, how to use DHCP (Dynamic Host Configuration Protocol), and how to set up remote access for your infrastructure using VPN and iptables. You will finish by looking at the various tools you will need to troubleshoot issues that may occur with your hosted infrastructure, including CPU, network, disk, and memory management tools such as top, netstat, iostat, and vmstat.

Author Syed Ali is a senior site reliability engineering manager with extensive experience in virtualization and Linux infrastructure. His previous experience as an entrepreneur in infrastructure computing gives him deep insight into how a business can leverage the power of Linux to its advantage. He brings his expert knowledge to this book to teach others how to perfect their Linux environments. Become a Linux infrastructure pro with Practical Linux Infrastructure today.

Chapter 1: Managing Large-Scale Infrastructure

This chapter is about the issues that come up in infrastructure management at a large scale. Mission-critical decisions about infrastructure architecture, as well as decisions on matters such as licensing and support, are all part of infrastructure design. When managing a small infrastructure, many of these issues do not have the same criticality as they do in a large enterprise. A scalable architecture is an important foundation for a successful, large infrastructure.

Infrastructure management can be divided into two components: application deployment and infrastructure architecture. First we review application deployment and then look at different infrastructure architecture models. We also review the components that make up a scalable infrastructure.

Application Deployment

From design to production, a scalable infrastructure enables developers to code, test, and deploy applications quickly. The traditional deployment model has not been effective at speeding up the design-to-production pipeline. Figure 1-1 shows the traditional deployment model, in which a major hurdle to reaching production is testing.
After developers have written code, they pass it to a quality assurance (QA) team that tests the code manually and sends it back to the development team for fixing. After the fixes are complete, the code is tested again. When the tests pass, the code is sent to staging, which is a replica of production. After the code passes through staging, the operations team deploys it to production using a change management process. This entire process for a single line of change in code might take two to four weeks.

Figure 1-1. Traditional design-to-production model

Because this model is not very effective, a newer model came into existence: continuous integration and continuous delivery (CI/CD). With this model, as soon as unit testing is complete and code is checked into the source code repository, the CI/CD pipeline kicks in. The pipeline consists of an automated build system, using tools such as Jenkins (http://jenkins-ci.org). The build system takes the code and builds it, outputting a compiled binary. The binary can then be installed into production in an automated manner (Figure 1-2).

Figure 1-2. Continuous integration and continuous delivery

The CI/CD pipeline automates as many tests as possible to gain confidence in the code. In addition, it performs a complete build of the software system and ensures the code that has been checked in does not cause anything to break. Any failure has to be investigated by a developer. There are numerous software products that can help implement a CI/CD pipeline, including the following:

• Jenkins (http://jenkins-ci.org)
• CruiseControl (http://cruisecontrol.sourceforge.net)
• Buildbot (http://buildbot.net)

Software Development Automation

Software development automation is the process of automating as many components of software development as possible. The goal of software development automation is to minimize the time between when a developer checks in code and when the product is in the hands of an end user or is being used in production. Without software development automation, there is a significant delay in getting software products to end users. There are various components of software development automation; a few of them are listed in the following sections, along with the tools that can help accomplish the automation needed.

Build Automation

Build automation is the process of automating the tasks that need to happen immediately after a developer checks in code, including actually building the code. The build is triggered on check-in by the developer. Some of the features of build automation are as follows:

• Frequent builds, to catch problems with code sooner rather than later
• Incremental build management, an attempt to build modules and then integrate them
• Build acceleration, so that a large code base does not slow down the build
• Build status reporting and failure detection

Some of the tools that can help with build automation are the following:

• Gradle (http://www.gradle.org)
• Apache Ant (http://ant.apache.org)
• Apache Maven (http://maven.apache.org)
• GNU Make (http://www.gnu.org/software/make/)
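As a minimal illustration of the idea, a build can be triggered from a hook on the central repository. The following sketch is not from the book; the repository paths, make targets, and notification address are hypothetical.

#!/bin/bash
# post-receive: runs on the central Git repository after every push
set -e
REPO=/srv/git/myapp.git           # bare repository this hook lives in
WORKDIR=/var/build/myapp          # scratch checkout used only for builds

rm -rf "$WORKDIR"
git clone "$REPO" "$WORKDIR"
cd "$WORKDIR"

# Build and run unit tests; mail the log to the team if anything fails
if ! make all test > build.log 2>&1; then
    mail -s "Build failed for myapp" dev-team@example.com < build.log
    exit 1
fi

A dedicated build server such as Jenkins adds build history, acceleration, and reporting on top of this basic trigger-and-build loop.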
Software Configuration Management

Software configuration management (SCM) is an integral component of software development automation. A few questions that need to be answered in software configuration management are

• Which source code management system do we use?
• How do we do code branches?
• What kind of revisions in software will we support?

Anything that has to do with managing the actual code itself is considered part of SCM. The following is a list of some of the tools that can help with the source code control part of SCM:

• Git (http://git-scm.com)
• SVN (https://subversion.apache.org)
• Mercurial (http://mercurial.selenic.com)

SCM is more of a process than a selection of tools; questions about how to manage branches, versions, merging, and code freezes should be answered with policy rather than a product.

Continuous Integration

There can be many different components of a large software project. After a developer checks in code, it has to be integrated with the rest of the code base. It is not feasible for one developer to work on integrating his or her components with all other components. Continuous integration (CI) ensures all the different components of code are eventually merged into one mainline that can then be built and released as a complete product. The components of an effective CI are as follows:

• The CI process should start automatically with a successful build of code.
• Developers should commit code frequently to catch issues early.
• The CI build should be self-testing.
• The CI build should be fast.

An example of a CI process is that of a web services application divided into a PHP front end and a Java back end, with MariaDB as the database. Any change to the PHP front-end code should trigger a full CI pipeline build that also builds the Java code and runs a full test that includes database calls.

Continuous Delivery

So far, we have automated builds using build automation and integrated our code with other components using CI; we have now arrived at the final step, continuous delivery (CD). CD is the process of getting code out to production as soon as a successful CI run is complete. This is one of the most crucial steps in software development: a botched CD can cause an outage in your environment. The output of CI can be a packaged binary. Software that can help with CD includes the following:

• GNU Autoconf (https://www.gnu.org/software/autoconf/)
• Go (http://www.go.cd)
• Chef (https://www.getchef.com/chef/)
• SaltStack (http://www.saltstack.com)
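To make the delivery step concrete, here is a minimal sketch of pushing a CI-built package to production; the package name, host list, and service name are hypothetical, and a real pipeline would add staged rollout and rollback.

#!/bin/bash
# deploy.sh: push the CI-built RPM to each web server and restart the service
set -e
PACKAGE=myapp-2.0-1.x86_64.rpm
SERVERS="web1.example.com web2.example.com"

for host in $SERVERS; do
    scp "$PACKAGE" "$host":/tmp/
    ssh "$host" "sudo yum -y install /tmp/$PACKAGE && sudo service myapp restart"
done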
For both software and infrastructure design, the waterfall model has fallen out of favor because of its limitations; Agile has become more popular.

Agile Methodology

Agile is an umbrella framework for numerous project management methods based on the Agile manifesto. The Agile manifesto is as follows:

"We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
• Individuals and interactions over processes and tools
• Working software over comprehensive documentation
• Customer collaboration over contract negotiation
• Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more." (http://en.wikipedia.org/wiki/Agile_software_development)

Agile has gained popularity over the waterfall methodology because the chances of having a successful project increase dramatically when Agile is followed correctly. One reason for the increased chance of success is that Agile does not assume all requirements are known up front; at the beginning of each sprint, the requirements can be reviewed again. Agile includes numerous methods:

• Kanban (http://en.wikipedia.org/wiki/Kanban_(development))
• Scrum (http://en.wikipedia.org/wiki/Scrum_(software_development))
• Lean (http://en.wikipedia.org/wiki/Lean_software_development)

There are many more Agile methods. You can read about them at http://en.wikipedia.org/wiki/Agile_software_development. Agile methodology for infrastructure engineering is increasing in adoption because developers increasingly do both development and operational tasks.

Scrum

Let's take a look at the Scrum methodology, which is part of the Agile framework. Scrum is a framework for learning about products and the process used to build them. Figure 1-4 shows Scrum in action.

Figure 1-4. Scrum at work

Scrum provides roles, meetings, and artifacts. There are three roles defined in Scrum: product owner, Scrum development team, and Scrum master. The product owner focuses on the "what" over the "how." He or she interfaces with customers, figures out a product road map, and prioritizes the product backlog, which contains a list of software features wanted by customers. The product owner is the single point of contact for interfacing with the team. The development team is cross-functional and consists of developers, testers, and the Scrum master; ideally it comprises four to nine people. The Scrum master has no management authority over the team. He or she is a facilitator who protects the team from distractions and interruptions, removes hurdles the team encounters, promotes good engineering practices, and acts as a checks-and-balances person. You can read more about what a Scrum master does at http://scrummasterchecklist.org.

Scrum has two important artifacts: the product backlog and the sprint backlog. The product backlog is a ranked list of everything that needs to be done; if it's not in the backlog, it is not something the team should be working on. Items in the backlog are written in user story format, or as use case scenarios. A user story is a documented requirement from an end user, written from the user's perspective. For example, "I want a website that allows me to sell clothing online" could be a user story that translates into a more defined backlog. The sprint backlog is what the team is currently committed to doing, and as such it has an end date. Sprints are usually no longer than two weeks, and each sprint is a short iteration that ends with a working, tested, shippable product. The sprint backlog covers both the "what" and the "how": the "what" is the product backlog items for this sprint; the "how" is the not-yet-started, in-progress, and completed tasks in the sprint.

There are four meetings defined by Scrum (Figure 1-5):

1. Sprint planning meeting
2. Daily Scrum
3. Sprint review meeting
4. Sprint retrospective meeting

There can also be one more meeting, called the backlog-grooming meeting. Let's assume the sprint is a two-week sprint. At the beginning of the two weeks, a sprint planning meeting is held.
During the meeting, three things are discussed: what you did yesterday, what you plan on doing today, and what the blockers are. Sprint review meeting is at the end of each sprint, before the sprint retrospective meeting. In the review meeting items that were completed from the backlog are shown, and any items that could not be completed are discussed and placed in the product backlog. The sprint ends with a sprint retrospective meeting during which the team discusses what went well in the given sprint and could have been done better. During the sprint backlog-grooming meeting, the product backlog is reviewed. Large items are broken down into smaller items that can be done in two weeks. Also, the items in the backlog may be ranked. Figure 1-5 provides an overview of the meetings. Scrum is one framework within Agile, and your team should review the other methods mentioned earlier, such as Lean, Kanban, and others, and pick one based on what suits the team the most. You can also pick the best processes of each of these methods and combine them into a new method that works for you. Following Agile can speed up infrastructure project delivery and help your enterprise operate at a large scale. Even for small teams, following Agile can help in delivering infrastructure sooner rather than later to both internal and external customers. Resources to learn more about Agile can be found at • http://agilemethodology.org • http://scrummethodology.com • http://agilemanifesto.org Web Architecture So far in this chapter we have looked at software development and project management methods for scalable infrastructure design. The next section of this chapter focuses on a scalable web architecture. There are numerous designs that can be used for web architectures. I start off explaining the most basic one—single tier—and move on to more complex designs. Single-Tier Architecture In a single-tier architecture, there is one server that hosts the application and is being accessed remotely. This is the most basic design in infrastructure engineering for web services. The single server might host a web server and a database server, if needed. There are no failover capabilities and it is difficult to grow this architecture horizontally. Figure 1-6 shows this type of design. It is not recommended to use this design unless you have a non-mission critical, very low-volume use application that you want to use for testing and development. Figure 1-5. Scrum meetings Chapter 1 ■ Managing Large-SCaLe infraStruCture 11 Global Architecture In today’s global world, your users can be anywhere. Consolidating data centers and pushing content through one or more central data centers that host your content may not be sufficient. One way to improve efficiency is to build point-of-presence sites globally. These are small caching-only sites that reside close to your users and can push content to users with low latency. Figure 1-12 presents an example of such a design, with two core data centers—one in Chicago and the other in Austin. The other sites, such as in Taiwan or Geneva, are caching-only sites that contact the two primary data centers, download data used more frequently by the users in that particular region, and store it. You need an application such as Apache Traffic Server (http://trafficserver.apache.org) running in your point- of-presence sites to support such a design. 
If building such an infrastructure is cost prohibitive, consider using a CDN, such as those offered by Akamai, Amazon, Rackspace, and other providers, to achieve the same results.

Figure 1-11. Six-tier architecture
Figure 1-12. Global scalable architecture

Autoscaling

Autoscaling is the process of increasing or decreasing infrastructure resources to meet varying traffic demands. Take the example of Netflix. Their infrastructure might see a large spike in incoming streaming requests on Saturday morning, when a lot of children are home and want to watch cartoons. On the other hand, they may see much lower use on a Monday morning, when most people are at work and children are in school. To handle this variance in traffic, autoscaling comes in handy. During peak hours, the load balancer can check uptime or some other metric from the real web servers and, based on a threshold, spin up new instances of web servers (Figure 1-13). When traffic slows down, the virtual instances can be deleted (Figure 1-14).

Figure 1-13. Peak usage
Figure 1-14. Autoscaling during normal traffic patterns

This strategy allows Netflix's cost to vary based on need; they do not have to keep the high-watermark number of instances up and running all the time. Autoscaling is offered in Amazon Web Services and in other clouds, such as Google Cloud. For private clouds, you have to create your own autoscaling mechanism unless your private cloud vendor provides one.
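For a private cloud, a homegrown autoscaler can be as simple as a cron job comparing a load balancer metric against thresholds. The following is only a sketch of that loop: the lb-stats, create-instance, lb-pool, and delete-instance commands are placeholders for whatever your load balancer and cloud actually expose, and the thresholds are arbitrary.

#!/bin/bash
# autoscale.sh: naive scale-up/scale-down loop (all commands are placeholders)
MAX_CONN_PER_SERVER=500
POOL=www-pool

conns=$(lb-stats --pool "$POOL" --active-connections)   # placeholder
servers=$(lb-stats --pool "$POOL" --server-count)       # placeholder

if [ "$((conns / servers))" -gt "$MAX_CONN_PER_SERVER" ]; then
    # overloaded: clone a new web server from a prebuilt image and add it
    new="web-$(date +%s)"
    create-instance --image web-server-image --name "$new"   # placeholder
    lb-pool add "$POOL" "$new"                               # placeholder
elif [ "$servers" -gt 2 ] && [ "$((conns / servers))" -lt $((MAX_CONN_PER_SERVER / 4)) ]; then
    # mostly idle: drain and delete one instance, never going below two
    victim=$(lb-stats --pool "$POOL" --least-busy-server)    # placeholder
    lb-pool remove "$POOL" "$victim"                         # placeholder
    delete-instance "$victim"                                # placeholder
fi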
Rolling Deployments

When a new version of code is ready for deployment, the traditional approach to upgrading existing code in production was to schedule downtime during which the service is not available. This is often called a maintenance window, and it is still used by some organizations. During the maintenance window, all users are drained from the service, and access to the service is disabled. Code is upgraded and tested, and then traffic is allowed back into the production cluster. A disadvantage of this method is that, if your web site generates revenue, you lose revenue during the maintenance window. Especially in today's global setting, where your external-facing web site may be accessed from anywhere in the world, keeping your site up 100% of the time is becoming more crucial.

An alternative approach is rolling deployments: upgrading a small portion of your infrastructure, testing it, and then upgrading more servers. There are two ways of doing this. One method is to deploy code on new servers, or virtual machines, that are added to the pool behind a load balancer. If things appear to work OK on the new servers, you keep adding more new servers with the new code base; after you reach 100% new code base servers, you can start to turn off the old code base servers. The other approach is to replace an existing server in the pool with a newer version of code (Figure 1-15). Say all eight servers for a given application are running v1.0 of the code, and we have to upgrade the code to v2.0. In Figure 1-16, we replace a single server with v2.0 of the code. We can do this in two ways: either upgrade the code directly on the server, or shut the server down and create a new server with v2.0 of the code. We then monitor this new server, which receives one-eighth of the traffic if the load balancer is using a round-robin load-balancing methodology. If the newer version of the code is functioning well, we proceed to replace 50% of the servers with the new code (Figure 1-17). As before, we can simply deploy new virtual machines and replace the older ones, or we can upgrade the code on them without actually installing new virtual machines.

Figure 1-15. Servers running 100% of v1.0 code
Figure 1-16. One-eighth of servers running the new version 2.0 of code
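A minimal rolling upgrade might look like the following sketch; the host names, package, health check URL, and lb-pool drain/enable commands are hypothetical stand-ins for your load balancer's mechanism.

#!/bin/bash
# rolling-upgrade.sh: upgrade one server at a time behind a load balancer
set -e
SERVERS="web1 web2 web3 web4 web5 web6 web7 web8"
PACKAGE=myapp-2.0-1.x86_64.rpm

for host in $SERVERS; do
    lb-pool remove www-pool "$host"            # placeholder: drain the server
    scp "$PACKAGE" "$host":/tmp/
    ssh "$host" "sudo yum -y install /tmp/$PACKAGE && sudo service myapp restart"

    # verify the new version responds before sending traffic back to it
    if ! curl -fs "http://$host/healthcheck" > /dev/null; then
        echo "upgrade failed on $host; stopping the rollout" >&2
        exit 1
    fi
    lb-pool add www-pool "$host"               # placeholder: re-enable the server
done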
The same ticketing solution can at times work for both internal and external customers, with different views. You can also have homegrown systems or purchase commercial products. Regardless of your choice, to scale, you must keep track of issues and use extensive reporting so that problem spots can be found and resolved sooner rather than later.

Network Operations Center

Another important component of operating at scale is having 24/7 support for your products or services. This can be accomplished in various ways, including the following:

• Follow the sun: have staff working where there is daylight; for example, start off with teams in San Francisco, who hand off to India, who then hand off to, say, Ireland.
• Have a single global location with 24/7/365 coverage: this means also having a night shift in the single location.
• Overlap shifts: have one shift that overlaps in part with another. An example is to start off in San Francisco, then move calls to Ireland. This entails long days for employees in San Francisco and Ireland.

Self-Service Support

In today's connected world, having only one type of support option, such as phone or e-mail, is not sufficient. Customers, both internal and external, expect support through various means:

• Online user and support forums
• Social media such as Twitter and Facebook
• An online knowledge portal
• Web- and phone-based support
• Remote screen sharing

To operate at scale, enterprises have to adopt one or more of these methods. As the user base of a given application grows, different users turn to different methods of seeking help. The more you are able to empower your end users to solve their own problems, the easier it will be for your organization to support them.

Bug Reporting

Tracking bugs and ensuring they are resolved helps in operating efficiently, which in turn enables large-scale operations. A few open source bug tracking tools include the following:

• Bugzilla (http://www.bugzilla.org/)
• Mantis (http://www.mantisbt.org/)
• Trac (http://trac.edgewall.org/)

Even for infrastructure-related issues, these bug tracking tools can be used in addition to ticketing tools to keep track of infrastructure issues. Or, you can use these bug reporting tools in lieu of ticketing tools, although you may lose some needed functionality that is specific to ticketing software and is not part of bug reporting software.

Inventory Management

Data center inventory management (DCIM) is a component of inventory management. Inventory management is generally used by finance to keep track of assets, and DCIM is useful for this; however, it adds an extra layer on top of infrastructure management. DCIM tools prove to be very helpful in the following scenarios:

• Capacity planning
• Power requirements
• Space planning
• System provisioning
• Contract negotiations
• Hardware life cycle management
• Older hardware procurement
• New hardware procurement

When dealing with hardware vendors, it is helpful to be able to go to them with numbers on mean time between failures, to negotiate better pricing or product improvements. Capacity planning in relation to power and space can be made a lot easier if an enterprise knows how many servers are present and how much power they consume.

What kind of data should be present in a DCIM system? The more the merrier is the right policy in this case. At a minimum, the following items should be present:

• Server make, model, year purchased, cost, and a link to the purchase order
• Storage, processors, memory, and any peripherals
• Operating system flavor and version, and applications running on the system
• Technical contact for the server administrator, and escalation contacts
• Ethernet address, power supplies, and remote console information

A lot more information can be stored in a DCIM, and it should also integrate with your server provisioning system, such as Kickstart. To provision a system, a script can pull down the media access control (MAC) address from the DCIM and create a PXE config file. Then, using the console information, you can reboot the system for operating system installation.
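A provisioning script along these lines might tie the DCIM to PXE. This is a sketch: dcim-query is a placeholder for your inventory system's CLI or API, and the kernel, Kickstart URL, and IPMI credentials are hypothetical; pxelinux does expect the config file name to be the MAC address, dash separated and prefixed with 01-.

#!/bin/bash
# provision.sh: generate a pxelinux config for a host using its MAC from the DCIM
set -e
HOST=$1
mac=$(dcim-query --host "$HOST" --field mac-address)   # placeholder, e.g., 00:25:90:ab:cd:ef

# pxelinux looks for a file named after the MAC: 01-00-25-90-ab-cd-ef
cfg="/var/lib/tftpboot/pxelinux.cfg/01-$(echo "$mac" | tr ':' '-' | tr 'A-Z' 'a-z')"

cat > "$cfg" <<EOF
default kickstart
label kickstart
    kernel vmlinuz
    append initrd=initrd.img ks=http://ks.example.com/ks/$HOST.cfg
EOF

# power cycle the machine over its remote console so it PXE boots
console=$(dcim-query --host "$HOST" --field console-address)   # placeholder
ipmitool -H "$console" -U admin -P secret chassis power cycle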
Hardware

The choice of hardware plays a very important role in the scaling of the infrastructure of an organization. Processors, memory, and storage technologies can make or break an application as well as a budget. In the following sections I try to make sense of the different choices available and how to pick the solutions that suit your needs the most.

Processors

Between the two major vendors of server processors, Intel and AMD, there exists a wide variety of processor families. When purchasing hardware at scale, you have to be careful which processor family you buy into, and at what point in the processor cycle you buy. If you look at the Intel microarchitecture (https://en.wikipedia.org/wiki/Intel_Tick-Tock), every 12 to 18 months there is either a new microarchitecture or a die shrink of the processor technology. You can capitalize on this and purchase hardware at different times of the cycle to optimize cost versus performance. In addition to cost, power consumption plays a key role in picking a processor architecture. A low-power processor that performs just as fast as a high-power processor might be more suitable for a large-scale deployment, to reduce the cost of running a data center. Low-power ARM-based (http://www.arm.com) processors are also gaining market share. As of this writing, the current Intel microarchitecture is Haswell, and the upcoming one, due by the end of 2014, is called Broadwell. Haswell is a new architecture compared with the previous one, called Ivy Bridge; Broadwell is a die shrink of Haswell, and a die shrink allows for cost reduction. When purchasing a server, take into consideration its microarchitecture and the stage of that microarchitecture. This affects your application performance and your financial costs, and you cannot afford to ignore it when operating at scale.

Memory

Another aspect of operating at a large scale is picking the right type of memory at the right price. Prices of memory vary a lot throughout the year. The right type of memory depends on the chipset of your server's motherboard and the requirements of the application. Yes, faster is better in this case, but cost is an important consideration, and so is power consumption. There are different types of memory, such as

• DDR4 (1600 MHz–3200 MHz)
• DDR3 PC3-6400, -8500, -10600, and -12800
• DDR2 PC2-4200, -5300, -6400, and -8000
• DDR PC1600, PC2100, PC2700, and PC3200
• SDRAM PC100, 125 MHz, and PC133

As of this writing, DDR4 is the latest memory type; it is not backward compatible with DDR3. DDR4 has higher module density and lower voltage requirements than DDR3. If the cost of DDR4 is a concern, consider using DDR3. The speed of DDR3 is shown in Table 1-1.

Table 1-1. Memory Speeds

Friendly name   Industry name   Peak transfer rate (MB/sec)   Data transfers/sec (in millions)
DDR3-800        PC3-6400         6,400                          800
DDR3-1066       PC3-8500         8,533                        1,066
DDR3-1333       PC3-10600       10,667                        1,333
DDR3-1600       PC3-12800       12,800                        1,600

Listing 1-1. Tuned Profiles

First, view all the different profiles available. These profiles apply specific kernel-level tuning parameters for network, disk, power, and memory to match performance to the requirements of the profile.

# tuned-adm list
Available profiles:
- sap
- virtual-guest
- spindown-disk
- default
- server-powersave
- latency-performance
- enterprise-storage
- laptop-ac-powersave
- throughput-performance
- laptop-battery-powersave
- desktop-powersave
- virtual-host

Next, view which profile is active. In our case, the virtual-host profile is active. This is probably a hypervisor, hence this profile is active.

# tuned-adm active
Current active profile: virtual-host
Service tuned: enabled, running
Service ktune: enabled, running

To select another profile, simply type the profile name after the profile keyword.

# tuned-adm profile throughput-performance
Reverting to saved sysctl settings: [ OK ]
Calling '/etc/ktune.d/tunedadm.sh stop': [ OK ]
Reverting to cfq elevator: dm-0 dm-1 dm-2 dm-3 dm-4 dm-5 dm[ OK ]dm-8 sda
Stopping tuned: [ OK ]
Switching to profile 'throughput-performance'
Applying deadline elevator: dm-0 dm-1 dm-2 dm-3 dm-4 dm-5 d[ OK ] dm-8 sda
Applying ktune sysctl settings: /etc/ktune.d/tunedadm.conf: [ OK ]
Calling '/etc/ktune.d/tunedadm.sh start': [ OK ]
Applying sysctl settings from /etc/sysctl.d/libvirtd
Applying sysctl settings from /etc/sysctl.conf
Starting tuned: [ OK ]

If you want to view all the different settings that are changed for a given profile, look in the /etc/tune-profiles/ directory, which lists each profile and the settings that are changed based on the profile.

Tuning TCP/IP

In addition to using system profiles, you should look at the TCP parameters on your systems and determine whether there is a need to tune them. Pretuned systems, set up as part of provisioning, help maintain a large-scale network with fewer issues. Linux maintains a buffer for TCP/IP packets. The buffer size is adjusted dynamically; however, you can set some limits on it. There are two different parameters that can be adjusted: the receive window and the send window. The receive window is the amount of data a recipient is willing to accept. The send window is the amount of data the sender may transmit before waiting for an acknowledgment. Listing 1-2 shows how to view and adjust these TCP parameters.
Listing 1-2. Tuning TCP

Inspect the TCP socket buffer status on the system using netstat. If we see packets that are pruned or collapsed, we may have to adjust the TCP send and receive windows to prevent the pruning. In this example, we can see 70 packets were pruned and 9325 packets were collapsed.

# netstat -s | grep socket
3118 resets received for embryonic SYN_RECV sockets
70 packets pruned from receive queue because of socket buffer overrun
75756 TCP sockets finished time wait in fast timer
23 delayed acks further delayed because of locked socket
9325 packets collapsed in receive queue due to low socket buffer

Review the existing size of the rmem and wmem parameters. rmem is the receive window and wmem is the send window. The first number is the smallest the buffer gets, the middle number is the default size with which a socket is opened, and the last number is the largest the buffer gets.

# cat /proc/sys/net/ipv4/tcp_rmem
4096 87380 4194304
# cat /proc/sys/net/ipv4/tcp_wmem
4096 16384 4194304

Let's increase the size and reset the counters for netstat. For servers with 1Gb or 10Gb network cards, a reasonable maximum size for the buffers is 16MB. To reset the counters, you have to reboot the system. To preserve these settings across reboots, enter them in /etc/sysctl.conf.

# echo "4096 87380 16777216" > /proc/sys/net/ipv4/tcp_wmem
# echo "4096 87380 16777216" > /proc/sys/net/ipv4/tcp_rmem
# cat /proc/sys/net/ipv4/tcp_rmem
4096 87380 16777216
# cat /proc/sys/net/ipv4/tcp_wmem
4096 87380 16777216

Increasing these values to a very large number can result in high latency and jitter (packet delay variation), also known as buffer bloat (https://en.wikipedia.org/wiki/Bufferbloat). The overall throughput of a network can be reduced because of buffer bloat, and applications such as VoIP are very sensitive to jitter. An application that can be used to check for buffer bloat is ICSI Netalyzr (http://netalyzr.icsi.berkeley.edu/).
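Because writes to /proc do not survive a reboot, the same limits would typically be persisted in /etc/sysctl.conf; a minimal excerpt mirroring the 16MB ceiling above:

# /etc/sysctl.conf excerpt: persist the TCP buffer limits across reboots
# values are: min  default  max (bytes)
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216

# apply the file without rebooting
# sysctl -p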
CPU Scheduling

CPU scheduling is the process of scheduling jobs on the processor. The kernel's job is to keep the CPU as busy as possible. There are two scheduling categories:

• Real time
  • SCHED_FIFO
  • SCHED_RR
• Normal
  • SCHED_OTHER
  • SCHED_BATCH
  • SCHED_IDLE

Real-time threads are scheduled first, using one of the real-time CPU schedulers. The normal scheduler is used for all threads that do not need real-time processing. In large-scale computing, you might need to adjust the CPU scheduling policy. You can check whether an adjustment is needed by keeping a tab on the nonvoluntary_ctxt_switches parameter, as shown in Listing 1-3. Here we are checking the context switching of the init process, which has a process ID (PID) of 1. You can replace the PID with any other running PID on the system to get information about the context switching of that PID.

Listing 1-3. Reviewing CPU Scheduling

# grep voluntary /proc/1/status
voluntary_ctxt_switches: 136420
nonvoluntary_ctxt_switches: 59

If the nonvoluntary context switches are high, you might want to consider changing the scheduler. For data throughput relating to network bandwidth and disk input/output, use the SCHED_OTHER scheduler. If latency is a concern, use SCHED_FIFO. Latency is defined as event response time; it differs from throughput in that the goal of throughput is to send as much data as possible, whereas the goal of latency is to send data as fast as possible. Normal policies result in better throughput than real-time policies because they do not preempt processes, as the real-time scheduler might do to ensure real-time processing. The command chrt (http://linux.die.net/man/1/chrt) can be used to manipulate the scheduling policy for a process. Listing 1-4 shows how to manipulate the CPU scheduling for a process.

Listing 1-4. Modifying CPU Scheduling

View the existing priority of PID 1 (init).

# chrt -p 1
pid 1's current scheduling policy: SCHED_OTHER
pid 1's current scheduling priority: 0

Each scheduling policy has limits. We can use the -m option to view them. Larger numbers mean greater priority.
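To round out the listing, switching a running process to a real-time policy looks like the following; this is a hedged example rather than part of the book's listing, and the PID is arbitrary.

# show the valid priority range of each scheduling policy
# chrt -m

# move PID 4242 to SCHED_FIFO with priority 50 (requires root)
# chrt -f -p 50 4242

# verify the change took effect
# chrt -p 4242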
Chapter 2: Hosted Cloud Solutions Using Google Cloud Platform

Private Cloud

Selecting a private cloud entails building a cloud infrastructure in your own data center. This option can be expensive, depending on how much investment is required and the size of your organization. OpenStack, for instance, is a distributed, in-house cloud solution that requires an extensive investment not only in hardware, but also in the engineering skills needed to manage the solution. The following is a brief list of topics to keep in mind when deciding to opt for an in-house private cloud:

• Vendor choice
• Hardware requirements
• Availability of engineers who can manage the cloud
• Integration with your existing environment
• Ongoing licensing costs, in some cases
• The ease or difficulty of managing the cloud
• Life cycle management of the cloud
• Total cost of ownership of the private cloud

The exercise of hosting your own cloud can be simplified if you have a robust infrastructure with the tools needed to deploy new hardware quickly. There are numerous private cloud software solutions; I listed two earlier: OpenStack, which is free, and VMware, for which you must pay. Although OpenStack is free, this does not mean the cost of implementing it is less than that of VMware. Keep in mind that you should always look at the total cost of ownership when considering your options. A private cloud architecture is shown in Figure 2-1.

Figure 2-1. Private cloud solution

Public Cloud

A public cloud solution can end up being more expensive than a private cloud or less, depending on the size of the enterprise. Some cloud providers, such as Amazon and Google, provide a cost calculator with which you can determine the approximate monthly cost of hosting your infrastructure on the provider's premises. Amazon's calculator can be found at https://calculator.s3.amazonaws.com/index.html, and Google's calculator is located at https://cloud.google.com/products/calculator/. Microsoft has its own cloud platform called Azure, and Salesforce.com is a market leader in the SaaS cloud solution provider space. Figure 2-2 gives an example of a public cloud.

Figure 2-2. Public cloud solution

Hybrid Cloud

Hybrid cloud solutions include an in-house cloud and an external public cloud, as shown in Figure 2-3. This type of solution can be used for multiple purposes. One use is disaster recovery: if the in-house cloud fails, the public cloud takes over serving data. An alternative is to use the in-house cloud for development and staging, and the public cloud for production release. Hybrid clouds can also help save money, because you may end up hosting only the production infrastructure in the pay-per-use public cloud and using a smaller scale, less expensive in-house cloud for nonproduction environments such as development, QA, and staging. As mentioned, OpenShift from Red Hat is a hybrid cloud solution; it has both on-premises and hosted infrastructure. The advantage of using an integrated solution such as OpenShift is that you only need to develop once, for a single platform, and you can deploy both on-premises and in a public cloud using the same deployment engine.

Figure 2-3. Hybrid cloud solution

Components of a Cloud

Cloud platforms, whether private, public, or hybrid, have some common components, including the following:

• Compute: virtual machines or instances that are available to install applications on and use for computing
• App engine: application containers into which code is uploaded; it runs in the cloud without your having to manage any virtual machines or instances
• Storage: object-based storage or nonrelational database storage
• Databases: MySQL or something similar
• Application programming interface (API): used to access cloud components, such as compute, app, and storage
• Metering: a way of tracking usage and either increasing or decreasing usage based on certain metrics; also used for billing
• Networking: used for communication between the cloud-based virtual machines, and from external sources into the cloud

An enterprise does not necessarily have to use all the components of a cloud. One or more components may be in the cloud while others stay outside it; similarly, one or more components may be in a public cloud while other components reside in a private cloud.

Migrating to the Cloud

Moving applications to the cloud can be a daunting task. Assuming an enterprise does not have a cloud solution but has decided to adopt one, how does it go about migrating its existing infrastructure applications to the cloud? In this section I explain a few things you should keep in mind to have a successful migration.

First, you should conduct a cost analysis. Using any of the cloud calculators mentioned earlier, calculate the cost of using a cloud such as Google Cloud. Then compare that with the cost of using an in-house cloud such as OpenStack. To calculate the cost of OpenStack, you have to include items such as hardware, both server and network, engineers, and support costs. A word of caution about online cloud cost calculators: the costs they predict may differ from what you actually end up paying, so factor in a margin of error.

The next step is to figure out which applications are good candidates to run in the cloud. Not all applications run well in the cloud; some may be better run in-house. There could be regulations in certain industries that prohibit using external cloud providers. For instance, if you are in the health care industry, privacy concerns may restrict you from uploading data to an external cloud. Another privacy concern may arise from the laws of a country. Germany, for instance, has very strict individual privacy laws that may prohibit German companies from uploading German customer data to an international cloud provider's network.

Next, it is crucial to develop a project plan that outlines which applications are moving and when. One possible way of migrating existing applications is to take the existing code base and push it into the cloud. This is perhaps not the most ideal way, because legacy applications not designed to run in the cloud may not work well there. For instance, if your application needs a database such as MySQL, MySQL may not be available in the cloud and you may have to use another database.
Another thing to keep in mind is that some applications are very sensitive to disk input/output speed and, because a majority of the cloud is based on virtualization, disk access will be slower than direct disk access; these types of applications may not work well in the cloud. There are some cloud providers that lease physical servers instead of virtual environments, but at this point it's hard to see an advantage in using hosted physical servers.

As an application developer, you also have to decide which cloud components are needed by an application. For enterprise applications within a corporate infrastructure, there are a host of other infrastructure services, such as monitoring, backup, and a network operations center (NOC), that provide support services. These are not available in the same form in the cloud. For instance, Google Cloud's SQL backup policy, which can be found at https://developers.google.com/cloud-sql/docs/backup-recovery, states that the last seven backups of an instance are stored without any charge. If your organization needs to keep backups longer than that, you must consider how that can be done and what it will cost. For monitoring applications, some cloud providers provide a framework that can be used programmatically, with alerts sent on certain trigger activations. Google Cloud provides an API (https://developers.google.com/cloud-monitoring/) that lets you read monitoring metrics, such as CPU and disk usage, and send alerts based on those metrics. An enterprise has to decide whether it makes sense to use this API or to leverage its existing monitoring infrastructure and extend it into the cloud.

• Storage
  • Google Cloud Storage (GCS)
  • Google Cloud Datastore (GCD)
  • Cloud SQL (GSQL)
  • BigQuery
• Development tools
  • Google Cloud SDK
  • Cloud Playground
  • Google Plugin for Eclipse
  • Push to Deploy
  • Android Studio

Accessing GCP is done through the web interface at https://cloud.google.com. There is also a software development kit (SDK) that can be used to manage GCP; however, some level of web user interface is required, at least initially, to set up your project. You can view extensive documentation on the above topics at https://developers.google.com/cloud/.

Projects

Projects in GCP are a way of grouping resources. For instance, if there are ten applications, each might get its own project. The name of a project is entirely up to you. Each project has a project name and a project ID. The project name can be changed later, but the project ID cannot be changed. A project also has a project number that is assigned automatically and cannot be changed. For instance, if your company is Example.com, then a project might be named Example-Web-Front-End, with a project ID of example-web-fe. The project ID is also the subdomain of the .appspot.com URL for GAE. All projects have permissions and billing. Without enabling billing, you cannot create any resources in the project. Providing a credit card number enables billing, as does providing a bank account number. Premier accounts do not need a credit card; instead, they are billed on a monthly basis. Enterprises generally sign up for a premier account when they know for sure that they want to deploy on GCP. One possible way of handling projects is to create three projects per application: one for development, another for QA or staging, and the third for production.
Developers can be given access to the development project, QA engineers to the QA project, and operations staff to the production project.

Permissions

The permissions in Google Cloud Platform are all based on Google accounts, which are used to log in to GCP, and are applied per project. There are three kinds of permissions in a project. "Is owner" allows full access, which includes billing and administrative control. "Can edit" allows full access to the applications, but not to billing and administrative control. "Can view" allows view permission without the ability to modify any setting. Permissions are crucial to a project, and having unnecessary permissions can result in a security breach, so be extra careful when granting them. For example, most projects should have different permissions for cloud management engineers versus developers versus the NOC. If all three categories of people are given the same level of access, one group may override the settings put in place by another group, causing virtual instances of the project not to be accessible. You can read more about permissions at https://cloud.google.com/developers/articles/best-practices-for-configuring-permissions-on-gcp.

Google Compute Engine

GCE provides virtual machines, also known as instances, that can be used to deploy code. GCE consists of disks, images, instances, networks, load balancers, and more. GAE is an alternative to GCE for certain types of applications, but with GAE you do not have a lot of the flexibility present in GCE. For instance, in GCE you can control networks, firewalls, load balancers, and virtual machine configuration, none of which is feasible in GAE. If you need this level of control over your application, use GCE and not GAE. You can get more information about GCE at https://developers.google.com/compute/.

Virtual Machines

To create a virtual machine in GCE, you can use the Google Cloud SDK or the web interface. There are different configurations available for virtual machines; a full list can be found at https://cloud.google.com/products/compute-engine/. Pricing varies based on instance type. Listing 2-1 is an example of creating a virtual machine using the default settings.
Listing 2-1. Create a Virtual Machine

##########
# Install the Google Cloud SDK
$ curl https://sdk.cloud.google.com | bash

##########
# Log in to GCP; this opens a web browser that asks you to give permission for GCP
$ gcloud auth login

##########
# Add an instance to project webexample
$ gcutil --project=webexample addinstance www1-example-com
Select a zone:
1: asia-east1-a
2: asia-east1-b
3: asia-east1-c
4: europe-west1-a
5: europe-west1-b
6: us-central1-a
7: us-central1-b
8: us-central1-f
>>> 6
Select a machine type:
1: n1-standard-1   1 vCPU, 3.75 GB RAM
2: n1-standard-16  16 vCPUs, 60 GB RAM
3: n1-standard-2   2 vCPUs, 7.5 GB RAM
4: n1-standard-4   4 vCPUs, 15 GB RAM
5: n1-standard-8   8 vCPUs, 30 GB RAM
6: n1-highcpu-16   16 vCPUs, 14.4 GB RAM
7: n1-highcpu-2    2 vCPUs, 1.8 GB RAM
8: n1-highcpu-4    4 vCPUs, 3.6 GB RAM
9: n1-highcpu-8    8 vCPUs, 7.2 GB RAM
10: n1-highmem-16  16 vCPUs, 104 GB RAM
11: n1-highmem-2   2 vCPUs, 13 GB RAM
12: n1-highmem-4   4 vCPUs, 26 GB RAM
13: n1-highmem-8   8 vCPUs, 52 GB RAM
14: f1-micro       1 vCPU (shared physical core) and 0.6 GB RAM
15: g1-small       1 vCPU (shared physical core) and 1.7 GB RAM
>>> 3
Select an image:
1: projects/centos-cloud/global/images/centos-6-v20140718
2: projects/debian-cloud/global/images/backports-debian-7-wheezy-v20140807
3: projects/debian-cloud/global/images/debian-7-wheezy-v20140807
4: projects/rhel-cloud/global/images/rhel-6-v20140718
5: projects/suse-cloud/global/images/sles-11-sp3-v20140712
>>> 1
INFO: Waiting for insert of instance www1-example-com. Sleeping for 3s.
[SNIP]
Table of resources:
+------------------+-------------+----------------+---------------+---------+
| name             | network-ip  | external-ip    | zone          | status  |
+------------------+-------------+----------------+---------------+---------+
| www1-example-com | 10.240.93.2 | 146.148.37.192 | us-central1-a | RUNNING |
+------------------+-------------+----------------+---------------+---------+
Table of operations:
+------------------------------------+--------+-------------------------------+----------------+
| name                               | status | insert-time                   | operation-type |
+------------------------------------+--------+-------------------------------+----------------+
| operation-140790998[SNIP]-4641f474 | DONE   | 2014-08-12T23:06:27.376-07:00 | insert         |
+------------------------------------+--------+-------------------------------+----------------+

##########
# SSH to the instance created earlier
$ gcutil --project webexample ssh www1-example-com

The type of instance you create depends on the computing requirements of the application. Be careful when picking the type of instance: if you overestimate, you end up paying for capacity that is not being used; if you underestimate, you can always add more instances. For example, instance type n1-standard-4 consists of four virtual cores and 15GB of memory and costs $0.280/hour in the United States. If you pick n1-standard-16 instead, the cost is $1.120/hour for 16 cores and 60GB of memory. You can figure out whether you are logged in to a GCE virtual instance by searching for "Google" in the output of the dmidecode command:

$ sudo dmidecode -s bios-vendor | grep Google
Google

Regions and Zones

GCE instances can be deployed in various regions and zones. Resources such as disks, instances, and IP addresses are zone specific. For example, an instance in us-central1-a cannot use a disk from us-central1-b. Regions are collections of zones.
For instance, us-central1 is a region that consists of zones a, b, and f. Two sample zones that are available in GCE as of this writing are depicted in Figure 2-6.

Figure 2-6. Sample GCE regions and zones

You can view all the zones available using the command shown here:

$ gcloud compute zones list
https://www.googleapis.com/compute/v1/projects/webexample/zones/asia-east1-a
https://www.googleapis.com/compute/v1/projects/webexample/zones/asia-east1-c
https://www.googleapis.com/compute/v1/projects/webexample/zones/asia-east1-b
https://www.googleapis.com/compute/v1/projects/webexample/zones/europe-west1-a
https://www.googleapis.com/compute/v1/projects/webexample/zones/europe-west1-b
https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-a
https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-f
https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-b

When coming up with an enterprise strategy for which zones to deploy in, the recommendation from Google is to spread instances across at least two zones: if one zone has an issue, your application stays up as long as another copy is running in another zone. If the enterprise is U.S. only, select regions based in the United States; if the enterprise is global, select the regions where the enterprise does business.

Tip: For fault tolerance, when creating instances, create at least two of each in different zones.

Support for processors also varies across zones. For instance, as of this writing, Intel Sandy Bridge is not available in us-central1-f. When you view the processor information of a virtual machine, you will see the type of processor that has been provided. An example is shown here for a dual-core virtual machine:

$ egrep 'vendor_id|model name' /proc/cpuinfo
vendor_id : GenuineIntel
model name : Intel(R) Xeon(R) CPU @ 2.60GHz
vendor_id : GenuineIntel
model name : Intel(R) Xeon(R) CPU @ 2.60GHz

You can read more about regions and zones at https://developers.google.com/compute/docs/zones.

Quotas

Because GCE is a shared environment, Google implements quotas to ensure that no single customer is able to use all the resources and affect another customer. There are two kinds of quotas: project-wide quotas and region-wide quotas. Quotas apply to resources such as static IP addresses, images, networks, and firewall rules. Project quota limitations are listed in Table 2-1.

Table 2-1. GCE Quotas

Resource          Limit
Firewalls           100
Forwarding rules     50
Health checks        50
Images              100
Networks              5
Routes              100
Snapshots          1000
Target pools         50

To view a list of networks being used in your project, use the following command:

$ gcloud compute networks list
NAME     IPV4_RANGE     GATEWAY_IPV4
default  10.240.0.0/16  10.240.0.1
corp-net 192.168.0.0/16 192.168.0.1

Caution: Not being mindful of quotas on projects can result in production outages, especially for autoscaling resources.

For an enterprise, it is crucial to get the right amount of resources, both for projects and for regions. If a project is using autoscaling and growing instances on demand, hitting a quota limitation may affect production. You can request that Google increase a quota by using the web interface; it may take them a few days to do so, so plan ahead. The form to request a quota increase is at https://docs.google.com/forms/d/1vb2MkAr9JcHrp6myQ3oTxCyBv2c7Iyc5wqIKqE3K4IE/viewform. Monitor your usage and your quota limits closely, and make the request at least a few days in advance. You can view a list of quota limitations at https://developers.google.com/compute/docs/resource-quotas.
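One way to keep an eye on quotas from a script is to read them from the project and region descriptions; the exact output format varies by SDK release.

# project-wide quotas (firewalls, networks, images, and so on)
$ gcloud compute project-info describe

# region-wide quotas (CPUs, static addresses, disks) for a given region
$ gcloud compute regions describe us-central1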
Firewalls

By default, all networks in GCE are protected by a firewall that blocks incoming traffic. You can connect from an external network only to those GCE instances that have a public IP address. If an instance does not have an external IP and you attempt to SSH to it, you will get an error such as the one shown here:

ERROR: (gcloud.compute.ssh) Instance [vm1] in zone [us-central1-b] does not have an external IP address, so you cannot SSH into it. To add an external IP address to the instance, use [gcloud compute instances add-access-config].

You can access instances within a GCE network that do not have an external IP from other instances within GCE, because GCE uses the internal address of the host. For instance, if you are on web-server2 and you ping web-server1, GCE will use the internal address of web-server1:

[web-server2 ~]$ ping web-server1 -c 1
PING web-server1.c.webexample.internal (10.240.107.200) 56(84) bytes of data.
64 bytes from web-server1.c.webexample.internal (10.240.107.200): icmp_seq=1 ttl=64 time=0.642 ms
--- web-server1.c.webexample.internal ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.642/0.642/0.642/0.000 ms

If you have one or more web servers behind a GCE load balancer, you do not have to assign an external IP address to the web server instances for them to accept incoming traffic. When you create the load balancer, GCE assigns the load balancer a public IP address, which is where all incoming traffic arrives; the load balancer then forwards the traffic to the web servers. On the other hand, if an instance is not behind a GCE load balancer, it has to have a public IP address to accept incoming connections from the Internet. For enterprises, assigning public IP addresses to instances is not recommended, for security reasons, unless needed. You can instead create a gateway host to which SSH is allowed from the corporate network; from the gateway host you can then SSH to all other instances within GCE. Iptables is not configured by default on GCE instances; however, it can be used in addition to the network firewall that GCE provides for added security, as shown in Listing 2-2.

Listing 2-2. GCE Firewall

##########
# List the firewall rules associated with all networks in a given project of GCE.
# We have just one network, called 'default'.
$ gcloud compute firewall-rules list
NAME                   NETWORK SRC_RANGES RULES                        SRC_TAGS TARGET_TAGS
default-allow-internal default 10.0.0.0/8 tcp:1-65535,udp:1-65535,icmp
default-https          default 0.0.0.0/0  tcp:443
default-ssh            default 0.0.0.0/0  tcp:22
http                   default 0.0.0.0/0  tcp:80

##########
# View detailed information about the http rule
$ gcloud compute firewall-rules describe http
allowed:
- IPProtocol: tcp
  ports:
  - '80'
creationTimestamp: '2013-12-18T18:25:30.514-08:00'
id: '17036414249969095876'
kind: compute#firewall
name: http
network: https://www.googleapis.com/compute/v1/projects/webexample/global/networks/default
selfLink: https://www.googleapis.com/compute/v1/projects/webexample/global/firewalls/http
sourceRanges:
- 0.0.0.0/0
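For completeness, a rule like the http rule above can also be created from the SDK; a sketch (the rule name and tag are arbitrary) might look like this:

# allow HTTP from anywhere, but only to instances tagged 'www'
$ gcloud compute firewall-rules create http \
    --network default --allow tcp:80 \
    --source-ranges 0.0.0.0/0 --target-tags www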
Creating a Custom Image ########## #create an instance called web-server, using the centos-6 image, in zone us-central1-b #we specify the scope of storage-rw and compute-rw to allow access to Google Cloud Storage #you can learn more about scopes using the command 'gcloud compute instances create --help' $ gcloud compute instances create web-server --scopes storage-rw compute-rw --image centos-6 --zone us-central1-b Created [https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-b/instances/ web-server]. NAME ZONE MACHINE_TYPE INTERNAL_IP EXTERNAL_IP STATUS web-server us-central1-b n1-standard-1 10.240.111.65 108.59.82.59 RUNNING ########## #SSH to the instance we created earlier #install httpd $ gcloud compute ssh web-server --zone us-central1-b [web-server ~]$ sudo yum install httpd -y Loaded plugins: downloadonly, fastestmirror, security Determining fastest mirrors * base: mirror.us.oneandone.net * extras: mirror.wiredtree.com * updates: centos.corenetworks.net .....[SNIP]..... Installed: httpd.x86_64 0:2.2.15-31.el6.centos .....[SNIP]..... Complete! [web-server ~]$ exit logout Connection to 108.59.82.59 closed. ########## #delete the instance while keeping the boot disk, because we don't need the instance anymore #you can also keep the instance for future modifications if you want $ gcloud compute instances delete web-server --keep-disks boot --zone us-central1-b The following instances will be deleted. Attached disks configured to be auto-deleted will be deleted unless they are attached to any other instances. Deleting a disk is irreversible and any data on the disk will be lost. - [web-server] in [us-central1-b] Do you want to continue (Y/n)? y Updated [https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-b/instances/ web-server]. Deleted [https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-b/instances/ web-server]. Chapter 2 ■ hosted Cloud solutions using google Cloud platform 42 ########## #now create a new image called web-server-image using the boot disk of the previous instance on which we installed httpd #the --source-disk provides the name of web-server, because that was the name of the disk from the earlier instance $ gcloud compute images create web-server-image --source-disk web-server --source-disk-zone us-central1-b Created [https://www.googleapis.com/compute/v1/projects/webexample/global/images/web-server-image]. NAME PROJECT DEPRECATED STATUS web-server-image webexample READY ########## #list the images and make sure you see the web-server-image $ gcloud compute images list NAME PROJECT DEPRECATED STATUS web-server-image webexample READY centos-6-v20140718 centos-cloud READY .....[SNIP]..... rhel-6-v20140718 rhel-cloud READY sles-11-sp3-v20140712 suse-cloud READY ########## #attempt to create a new instance called web-server2 using the web-server-image we created earlier $ gcloud compute instances create web-server2 --image web-server-image --zone us-central1-b Created [https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-b/instances/ web-server2]. 
NAME        ZONE          MACHINE_TYPE  INTERNAL_IP   EXTERNAL_IP  STATUS
web-server2 us-central1-b n1-standard-1 10.240.84.183 108.59.82.59 RUNNING

##########
#once the instance is up, SSH to it and verify that httpd is present on it
$ gcloud compute ssh web-server2 --zone us-central1-b
[web-server2 ~]$ hostname
web-server2
[web-server2 ~]$ rpm -qa | grep -i httpd
httpd-tools-2.2.15-31.el6.centos.x86_64
httpd-2.2.15-31.el6.centos.x86_64
[web-server2 ~]$ exit
logout
Connection to 108.59.82.59 closed.

Network Load Balancing

Load balancing in GCE is straightforward. It does not support multiple load-balancing algorithms as of this writing. The algorithm supported is protocol based; in other words, it is based on address, port, and protocol type. By default, GCE picks real servers based on a hash of the source IP and port, and the destination IP and port. Incoming connections are spread across the real servers, not by packet, but by connection. For instance, if there are three real servers and a connection is made to a given real server, until the connection is closed, all packets for that connection will go to the same server.

Note ■ Google does not support different load-balancing algorithms, such as least connection, dynamic round robin, weighted round robin, predictive, and observed.

To set up load balancing, first create a target pool. The pool should consist of one or more real servers to which traffic will be sent. A pool can contain instances in different zones as long as they are in the same region. After creating a pool, create a forwarding rule that forwards traffic to the previously created target pool. The type of traffic that is forwarded can be TCP or UDP, and you can specify a range of ports. Last, open the forwarded ports on the firewall rules that apply to the real servers so that traffic can flow to them. If encryption is needed, it has to be set up at the instance level; the load balancer does not do any decryption for you. You cannot terminate a secure sockets layer connection at the load balancer and expect it to communicate unencrypted with the real servers. The general steps are summarized here and are shown in Listing 2-5:

1. Create web server instances.
2. Install httpd, and start httpd on the servers.
3. Tag the web servers with a tag such as www.
4. Create a firewall rule to allow HTTP traffic to target tag www.
5. Verify that you can access the web servers remotely on port 80.
6. Create a health check on port 80 for the load balancer.
7. Define a target pool and add the two instances to the pool.
8. Create a load balancer forwarding rule to forward HTTP port 80 traffic to the previously created target pool.
9. Ensure the forwarding rule works.

Listing 2-5. Load-Balanced Web Server Deployment

###############
#view a list of instances in our project
#it looks like we have only one instance called web-server2
$ gcloud compute instances list
NAME        ZONE          MACHINE_TYPE  INTERNAL_IP   EXTERNAL_IP  STATUS
web-server2 us-central1-b n1-standard-1 10.240.84.183 108.59.82.59 RUNNING

###############
#because we want to test load balancing, we are going to create another web server
#the second web server will be called web-server1, because the earlier one is called web-server2
#this web server is going to be in another zone, in the same region, for fault tolerance in case one zone goes down
$ gcloud compute instances create web-server1 --image web-server-image --zone us-central1-a
Created [https://www.googleapis.com/compute/v1/projects/webexample/zones/us-central1-a/instances/web-server1].
NAME        ZONE          MACHINE_TYPE  INTERNAL_IP    EXTERNAL_IP     STATUS
web-server1 us-central1-a n1-standard-1 10.240.107.200 199.223.235.248 RUNNING

###############
#add instances to the pool www-pool
$ gcloud compute target-pools add-instances www-pool --instances web-server1 --zone us-central1-a
Updated [https://www.googleapis.com/compute/v1/projects/webexample/regions/us-central1/targetPools/www-pool].
$ gcloud compute target-pools add-instances www-pool --instances web-server2 --zone us-central1-b
Updated [https://www.googleapis.com/compute/v1/projects/webexample/regions/us-central1/targetPools/www-pool].

###############
#create a forwarding rule in the load balancer
$ gcloud compute forwarding-rules create www-rule --region us-central1 --port-range 80 --target-pool www-pool
Created [https://www.googleapis.com/compute/v1/projects/webexample/regions/us-central1/forwardingRules/www-rule].
NAME     REGION      IP_ADDRESS     IP_PROTOCOL TARGET
www-rule us-central1 173.255.119.47 TCP         us-central1/targetPools/www-pool

###############
#check the forwarding rule
$ gcloud compute forwarding-rules describe www-rule --region us-central1
IPAddress: 173.255.119.47
IPProtocol: TCP
creationTimestamp: '2014-08-15T17:26:05.071-07:00'
id: '11261796502728168445'
kind: compute#forwardingRule
name: www-rule
portRange: 80-80
region: https://www.googleapis.com/compute/v1/projects/webexample/regions/us-central1
selfLink: https://www.googleapis.com/compute/v1/projects/webexample/regions/us-central1/forwardingRules/www-rule
target: https://www.googleapis.com/compute/v1/projects/webexample/regions/us-central1/targetPools/www-pool

###############
#we can use curl to check the web server
#the reason we see web-server1 and web-server2 is that the load balancer is sending the requests to each of the web servers
$ while true; do curl -m1 173.255.119.47; done
web-server1
web-server2
web-server1
web-server2
web-server2
web-server2

You can read more about network load balancing in the GCE world at https://developers.google.com/compute/docs/load-balancing/network/.

Maintenance

Google performs scheduled maintenance on the GCE infrastructure periodically. Maintenance can be transparent, meaning it stays within a zone, or it can affect an entire zone. In the case of transparent maintenance, instances are moved between hypervisors, and as such you may not notice it. The movement of an instance may result in minor performance degradation. For complete zone maintenance, instances are not moved to another zone and therefore are shut down.
If your application is running in only a single zone, then it will be down during the maintenance window, which can last two weeks. By default, Google live-migrates instances during a scheduled maintenance window. You can set an instance to terminate and restart during the maintenance instead of using live migration. If this option is set, then Google sends a signal to the instance to shut down. After that, Google terminates the instance and then performs the scheduled maintenance. After maintenance is complete, the instance is powered back on. You can view the operations performed on instances in a zone, including maintenance operations, by using the operations list command as shown here:

$ gcloud compute operations list --zones us-central1-b
NAME                                                  TYPE   TARGET                      HTTP_STATUS STATUS
operation-..[SNIP]..-9e593195                         delete us-central1-b/instances/vm1 400         DONE
operation-..[SNIP]..327480ef                          delete us-central1-b/instances/vm2 200         DONE
operation-..[SNIP]..-33aa0766                         reset  us-central1-b/instances/vm3 200         DONE
systemevent-..[SNIP]..tances.migrateOnHostMaintenance ..[SNIP]..vm4                      200         DONE

You can also configure GCE to restart an instance automatically if it crashes. This can be done through the web console or the API. As an enterprise strategy, using transparent maintenance has an advantage, because you do not have to worry about an instance being shut down during maintenance. In addition, if you enable the autorestart feature of an instance, in case the instance crashes, it comes back online automatically. You can read more about Google's maintenance policy at https://developers.google.com/compute/docs/robustsystems.

Google Cloud Storage

So far we have looked at GCE, or Google Compute Engine, which is an instance-based environment for building a cloud infrastructure. In addition to the compute environment, Google also provides a robust storage environment. The components included in cloud storage are Google Cloud Storage (GCS), Google Cloud Datastore (GCD), Google Cloud SQL (GSQL), and BigQuery.

Google Cloud Storage (https://cloud.google.com/products/cloud-storage/) is an object store service. Access is available through an API and also the GCS web interface. There are two types of storage solutions in GCS: standard storage and durable reduced availability (DRA). DRA is suitable for backups and batch jobs, because some unavailability should be acceptable for DRA applications. DRA costs less than standard storage. The price difference between the two can be viewed at https://developers.google.com/storage/pricing#storage-pricing.

Google Cloud SQL (https://developers.google.com/cloud-sql/) is a relational database, similar to MySQL. It is instance based and offers automatic backups and replication as well.

Google Cloud Datastore (https://developers.google.com/datastore/) is a managed, schema-less database for storing nonrelational data. It is a NoSQL database that is highly scalable and reliable. For cases when GSQL will not do, GCD might be a better option.

Google BigQuery (https://developers.google.com/bigquery/) is a data analytical environment, not a data store. You can bulk upload data from GCS or stream it in. There is a browser interface, command line, and API access.

The choice of a storage solution depends entirely on the application. If a data center inventory management system is being developed and the data are suitable for a relational database, GSQL might be a good solution.
On the other hand, if a social networking application is being developed, GCD might be a better solution because of the volatility of the data.

Google App Engine

Google App Engine (https://developers.google.com/appengine/) is a PaaS environment that lets you upload applications to the cloud and run them on Google's infrastructure. Unlike GCE, there are no compute instances to maintain; you simply upload your application and Google runs it. GAE supports Java, Python, PHP, and Go. To develop applications for GAE, you can download the SDK (https://developers.google.com/appengine/downloads) and then start writing code, which, once uploaded into GAE using the SDK, can be run. GAE integrates well with other Google Cloud solutions, such as GCE and GCS. For an enterprise to invest in GAE, it is crucial to understand which applications are likely candidates. Security is a huge concern with GAE, because a poorly written application can cause a data leak on the Internet. Authentication, authorization, and encryption are key components of a successful GAE deployment strategy. An in-depth discussion of GAE is out of scope for this chapter, because there is no infrastructure to run or manage.

Deployment Tools

There are numerous tools available to interact with GCP. Cloud Playground (https://code.google.com/p/cloud-playground/) is a quick way to try different Google cloud services without downloading the SDK. Google Plugin for Eclipse (https://developers.google.com/eclipse/) lets you interact with GCP from within Eclipse. It is a very useful tool for developers who want to upload code to GCP from within an integrated development environment. Push to Deploy (https://developers.google.com/cloud/devtools/repo/push-to-deploy) is a way to deploy code into GCP by pushing to a Git repository. This method is part of a "release pipeline" that saves the effort of uploading to GCP using the SDK. Android Studio (https://developer.android.com/sdk/installing/studio.html), which is an Android development environment, also supports a back end for GCP. This makes it easy to test and deploy to a GCP back end for Android applications (https://developers.google.com/cloud/devtools/android_studio_templates/).

Google Cloud SDK

The SDK is an essential tool for managing GCE. As mentioned earlier, the way to install the SDK is by using curl:

curl https://sdk.cloud.google.com | bash

When an update is available for the SDK, you will see a message similar to the one shown here anytime you attempt to use the SDK:

There are available updates for some Cloud SDK components. To install them, please run:
$ gcloud components update

To update the SDK, run the following command:

$ gcloud components update
The following components will be updated:
--------------------------------------------------------------------------------------------
| App Engine Launcher Application for Mac     | 1.9.7  | 7.3 MB   |
| App Engine SDK for Java                     | 1.9.7  | 153.1 MB |
| App Engine SDK for Java (Platform Specific) | 1.9.6  | < 1 MB   |
| BigQuery Command Line Tool                  | 2.0.18 | < 1 MB   |
[SNIP]
Do you want to continue (Y/n)? Y

Conclusion

As of this writing, cloud computing appears to be not a passing trend but a viable way of running applications. There are numerous decisions to be made with respect to cloud computing, and I hope the information in this chapter helps you make those critical decisions. Whether it is an on-premises cloud, a hosted cloud, or a combination of both, investing in a cloud can help an enterprise reduce time to deployment for production applications. Furthermore, many corporations have enjoyed enormous savings by investing in cloud computing and moving away from traditional brick-and-mortar infrastructure services. Computing as a service makes it easy for application developers to focus on writing the application, rather than focusing on the infrastructure that runs it. Although GCP is relatively new compared with, say, Amazon Web Services or Rackspace, GCP is gaining rapidly in popularity and offers a viable solution for an enterprise to run its applications.

Chapter 3
Virtualization with KVM

This chapter covers designing and implementing enterprise-class virtualization solutions. I focus on Kernel-Based Virtual Machine (KVM) because it is Linux based. The topics in this chapter include how to understand virtualization, select hardware, and configure networks; storage; file system choices; optimization; security concerns; and a reference architecture to put it all together.

What Is Virtualization?

Virtualization of the operating system is the creation of a virtual machine (VM) within another machine. The host is called the hypervisor and the guest is called a virtual machine. As seen in Figure 3-1, five VMs are running on a single physical box. Assuming the host, or hypervisor, is running RedHat or CentOS, and the VMs are also running the same, you end up with six copies of the operating system.

Figure 3-1. Virtualization in a nutshell

With KVM, you first install the base operating system, then the KVM packages, and after that you can start creating VMs. Some advantages of using virtualization for an enterprise are as follows:

• Reduced capital expenditure because you buy fewer servers
• Faster provisioning because you can scale on demand
• Reduced energy costs because of fewer servers
• Disaster recovery made easier using high availability
• Easier support of legacy applications
• One step closer to moving to the cloud
• Reduced support requirements because of a smaller data center footprint

Virtualization is not a panacea, by any means. Some cons of using virtualization are as follows:

• The abstraction layer of virtualization adds a performance penalty.
• Overprovisioning is easy to do on a virtualization platform, resulting in degraded system performance during peak hours.
• Slow adoption of software-defined networks has resulted in difficult-to-manage, congested virtual networks.
• Rewriting of applications to be more virtual/cloud friendly can result in additional up-front costs of adoption.
• Loss of a hypervisor can result in the loss of numerous VMs on that hypervisor.
• Virtualization administration requires additional training and processes in the operations world.

Virtualization Solutions

Some of the different enterprise-class virtualization solutions available are the following:

• LXC (https://linuxcontainers.org/)
• OpenVZ (http://openvz.org/Main_Page)
• QEMU/KVM (http://www.linux-kvm.org/page/Main_Page)
• VMware (http://www.vmware.com/)
• XenServer (http://www.xenserver.org/)
• Microsoft's Hyper-V, Windows based (http://www.microsoft.com/en-us/server-cloud/solutions/virtualization.aspx)
• Bhyve, FreeBSD based (http://bhyve.org/)

This chapter covers KVM. The choice of which platform to pick can be complex.
One possible option is to compare two or more solutions in your environment using virtualization benchmark software such as SPEC virt (http://www.spec.org/virt_sc2013/). With SPEC virt you spin up a number of VMs and then run different workloads, such as web servers, database servers, and more. At the end, SPEC virt produces a set of numbers you can compare to determine whether XenServer, KVM, or another virtualization platform gives you better performance.

# Creating a virtual machine. In this example, we are using br0, which is a network bridge, and routed mode.
# In addition, we are pointing to a local ISO image for installation, and displaying graphics of the VM using Spice.
# virt-install --connect qemu:///system --name vm1.example.com \
--ram 32768 --vcpus 4 --disk path=/vm1/vm1.example.com.qcow2 \
--network=bridge:br0 --os-type=linux --os-variant=rhel6 \
--cdrom /vm1/iso/CentOS-6.4-x86_64-bin-DVD1.iso \
--graphics spice,password=mypassword --autostart

# Enable libvirt to start automatically.
# chkconfig libvirtd on
# service libvirtd start

# Start a VM.
# virsh start vm1.example.com

# Stop a running VM.
# virsh shutdown vm1.example.com

# Shut down a VM forcefully.
# virsh destroy vm1.example.com

# Delete a VM definition.
# virsh undefine vm1.example.com

Automated KVM Installation

Kickstart can be tuned to support hundreds of hosts at any given time. Out of the box, after you tune TFTP limits, you can easily clone 500 hypervisors at a time. Basically, you configure PXE to boot the hypervisors. After that, install CentOS or RedHat, followed by the installation of the KVM packages. Listing 3-2 shows a sample PXE Linux configuration file and Listing 3-3 shows a sample Kickstart configuration file.

Listing 3-2. PXE Linux Config

default menu.c32
prompt 0
timeout 5
menu title PXE Boot Menu

label 1
  menu label ^1 - Install KVM
  kernel images/centos/6.5/x86_64/vmlinuz
  APPEND text load_ramdisk=1 initrd=images/centos/6.5/x86_64/initrd.img network noipv6 ksdevice=eth0 ks=http://ks/kickstart/ks.cfg i8042.noaux console=tty0

label local
  menu label ^0 - Boot from first hard drive
  com32 chain.c32
  append hd0

Listing 3-3. Kickstart Postinstall File

# commands sections (required)
bootloader --location=mbr
authconfig --enableshadow
keyboard us
autopart

# optional components
clearpart --all
firewall --disabled
install --url http://ks.example.com/centos/6.4
network --bootproto=static --ip=10.1.1.100 --netmask=255.255.255.0 --gateway=10.1.1.1 --nameserver=10.1.1.10

#packages section (required)
%packages
@Virtualization

# preinstall section (optional)
%pre

# postinstall section (optional)
%post

The question is: How do you make Kickstart of KVM enterprise friendly? Setting up a single Kickstart server is not sufficient for an enterprise. Using the reference architecture for Example.com defined later in this chapter, if we have three different sites, with at least 500 hypervisors per site, we need to set up numerous Kickstart servers per site. Also, because PXE is broadcast based, we have to set up IP helpers on the routers between the different networks of hypervisors. We want to avoid a flat network space for all hypervisors, because it is difficult to manage. An important question to answer is: How many concurrent Kickstarts do you expect to take place? The solution for a Kickstart architecture is based on your answer. There are numerous ways to devise a solution, and I outline two possibilities in the following sections.

Clustered Kickstart Solution

With this solution we set up two clusters: one for PXE booting and the other for serving CentOS installation files over HTTP. Per site there will be a pair of clusters. In each cluster, we will have a pair of load balancers and real servers. Instead of using PXE, let's use iPXE (http://ipxe.org/), which supports PXE over HTTP. Another pair of DHCP servers running in primary and secondary mode will serve DHCP. There is no need to run DHCP behind a load balancer because, if you use Internet Systems Consortium (ISC) DHCPD (https://www.isc.org/downloads/dhcp/), primary and secondary modes are supported. The advantage of using a clustered solution is that you can grow on demand while reducing server sprawl. Each site gets a single cluster, and as incoming connections increase, you can increase the number of real servers behind the load balancers to match the load. IP helpers have to be configured on the routers across the networks of the hypervisors to pass DHCP traffic to the DHCP servers. An example is shown in Figure 3-3.

Figure 3-3. KVM installation in clustered mode

The boot order of the hypervisors in BIOS is as follows:

• Hard drive
• Network/PXE

During the first boot, because no operating system is installed, the hard drive boot fails, and a boot is then tried off the network. Because we have configured PXE, the boot continues and the installation can start. The load balancers for the iPXE server and the Kickstart servers can be your enterprise-approved load balancer. HAProxy is a free load balancer (http://www.haproxy.org/) that can be used for the Kickstart server load balancing. HAProxy does not support UDP, so for PXE you may need a UDP-based load balancer, such as the one from F5 Networks (https://f5.com/). Listing 3-4 shows a sample DHCPD configuration file when using PXE. Note the class pxeclients.

Listing 3-4. DHCPD Configuration for PXE

subnet 10.1.1.0 netmask 255.255.255.0 {
  option domain-name-servers 10.1.1.2;
  option routers 10.1.1.1;
  pool {
    failover peer "failover-partner";
    range 10.1.1.50 10.1.1.250;
  }
  class "pxeclients" {
    match if substring(option vendor-class-identifier, 0, 9) = "PXEClient";
    next-server 10.1.1.3;
    filename = "pxelinux.0";
  }
}

In the DHCPD configuration example, DHCP IP addresses are leased in the 10.1.1.0/24 subnet. The IP addresses provided will be in the range of 50 to 250. When a server boots, it first gets an IP from the DHCP server, at which point the DHCP server points to the PXE server using the next-server string. The booting server then uses TFTP to contact the iPXE server and download the PXE boot file, from which the server boots. After it boots using the PXE kernel, it can then download the CentOS or RedHat installation file and start the installation. Listing 3-5 shows a DHCPD primary server configuration file; Listing 3-6 shows the secondary DHCPD configuration file.

# virt-sysprep output while preparing a template image for cloning (listing continues):
Performing "smolt-uuid" ...
Performing "script" ...
Performing "samba-db-log" ...
Performing "rpm-db" ...
Performing "rhn-systemid" ...
Performing "random-seed" ...
Performing "puppet-data-log" ...
Performing "pam-data" ...
Performing "package-manager-cache" ...
Performing "pacct-log" ...
Performing "net-hwaddr" ...
Performing "net-hostname" ...
Performing "mail-spool" ...
Performing "machine-id" ...
Performing "logfiles" ...
Performing "hostname" ...
Performing "firstboot" ...
Performing "dovecot-data" ...
Performing "dhcp-server-state" ...
Performing "dhcp-client-state" ...
Performing "cron-spool" ...
Performing "crash-data" ...
Performing "blkid-tab" ...
Performing "bash-history" ...
Performing "abrt-data" ...
Performing "lvm-uuids" ...

# Using the earlier created template, clone a new VM.
# virt-clone -o centos.template -n newclone -f /vm1/newclone.img
Allocating 'newclone.img'              | 8.0 GB     00:09
Clone 'newclone' created successfully.

# Make sure you can see the newly cloned virtual machine.
# virsh list --all
 Id Name             State
----------------------------------------------------
 1  vm1.example.com  running
 2  vm2.example.com  running
 -  centos.template  shut off
 -  newclone         shut off

# Start the new cloned VM.
# virsh start newclone
Domain newclone started

# Ensure that it is running.
# virsh list
 Id Name             State
----------------------------------------------------
 1  vm1.example.com  running
 2  vm2.example.com  running
 3  newclone         running

We have to use virt-sysprep to prepare the image for cloning. virt-sysprep modifies the image, removing or unconfiguring certain settings that, if left in place, would conflict on another VM. sysprep stands for system preparation. You can read more about it at http://libguestfs.org/virt-sysprep.1.html. virt-clone clones the template created with virt-sysprep. There are two types of clones: linked clones and full clones. A linked clone depends on the image from which it was cloned, and the original image cannot be deleted. A full clone, on the other hand, is independent of the image from which it was cloned.

KVM Management Solutions

There are numerous solutions available to manage KVM; some of them are free, whereas others are commercial products. You can find a list of such solutions at http://www.linux-kvm.org/page/Management_Tools. The choices can be split broadly into two categories: one is command line or shell based, and the other is graphical or graphical user interface based. oVirt (http://www.ovirt.org/Home) is a very popular open source software used to manage KVM instances. RedHat has a commercial product built around oVirt called RHEVM, or RedHat Enterprise Virtualization Manager (http://www.redhat.com/products/cloud-computing/virtualization/). A third option is to write your own management using the libvirt API. You can find out more about libvirt at http://libvirt.org/.

Libvirt

Libvirt is a toolkit that supports interaction with various virtualization platforms, KVM being one of them. The API for libvirt is extensive and is very useful if you write your own management around KVM. The C library reference for libvirt can be found at http://libvirt.org/html/libvirt-libvirt.html. Numerous language bindings are available for libvirt, such as C#, Java, OCaml, Perl, PHP, Python, and Ruby. In the following example I use the Python bindings to demonstrate how to write a basic management application for KVM (Listing 3-8). You can find more information about the bindings at http://libvirt.org/bindings.html.

Listing 3-8. Sample Libvirt Python Code

import libvirt
import sys

# Open a read-only connection to the local hypervisor.
conn = libvirt.openReadOnly(None)
if conn is None:
    print 'Failed to open connection to the hypervisor'
    sys.exit(1)

# Get some information about the hypervisor.
hv_info = conn.getInfo()

# Print out the architecture, memory, cores, and speed of the processor.
print 'hv arch {0}'.format(hv_info[0])
print 'hv memory {0}'.format(hv_info[1])
print 'cores in hv {0}'.format(hv_info[2])
print 'Mhz speed of hv CPU {0}'.format(hv_info[3])

virsh

virsh is included with KVM, and it is a quick and easy way of managing KVM. You can use virsh and skip other management solutions if you prefer simplicity. virsh uses libvirt. However, you have to script around virsh to manage a large number of hypervisors. Some examples of using virsh are shown in Listing 3-9. You can read more about virsh at http://linux.die.net/man/1/virsh.

Listing 3-9. virsh Examples

# Given a file called hv.txt, which contains a list of KVM hypervisors, loop through the file and get a list of VMs running on each hypervisor.
# cat hv.txt
hv1.example.com
hv2.example.com
hv3.example.com

# cat hv.sh
#!/bin/bash

HVLIST=hv.txt
USER=fakeuser

for hv in `cat ${HVLIST}`
do
  echo ${hv}
  virsh -c qemu+ssh://${USER}@${hv}/system list
done

# When you run hv.sh, below is a sample output you may get.
# ./hv.sh
hv1.example.com
 Id Name             State
----------------------------------------------------
 1  vm1.example.com  running
 3  vm2.example.com  running
 4  vm3.example.com  running

hv2.example.com
 Id Name             State
----------------------------------------------------
 1  vm4.example.com  running
 3  vm5.example.com  running
 4  vm6.example.com  running

hv3.example.com
 Id Name             State
----------------------------------------------------
 1  vm7.example.com  running
 3  vm8.example.com  running
 4  vm9.example.com  running

Listing 3-10. KVM Hypervisor Information

# Get version information.
# virsh version
Compiled against library: libvirt 0.10.2
Using library: libvirt 0.10.2
Using API: QEMU 0.10.2
Running hypervisor: QEMU 0.12.1

# View information about a hypervisor.
# virsh sysinfo
<sysinfo type='smbios'>
  <bios>
    <entry name='vendor'>Dell Inc.</entry>
    <entry name='version'>2.2.3</entry>
    <entry name='date'>10/25/2012</entry>
    <entry name='release'>2.2</entry>
  </bios>
  <system>
    <entry name='manufacturer'>Dell Inc.</entry>
    <entry name='product'>PowerEdge T110 II</entry>
    <entry name='version'>Not Specified</entry>
    <entry name='serial'>XXXXXXX</entry>
    <entry name='uuid'>4C4C4544-0034-5210-8054-B7C04F435831</entry>
    <entry name='sku'>Not Specified</entry>
    <entry name='family'>Not Specified</entry>
  </system>
  <processor>
    <entry name='socket_destination'>CPU1</entry>
    <entry name='type'>Central Processor</entry>
    <entry name='family'>Xeon</entry>
    <entry name='manufacturer'>Intel(R) Corporation</entry>
    <entry name='signature'>Type 0, Family 6, Model 58, Stepping 9</entry>
    <entry name='version'>Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz</entry>
    <entry name='external_clock'>100 MHz</entry>
    <entry name='max_speed'>4000 MHz</entry>
    <entry name='status'>Populated, Enabled</entry>
    <entry name='serial_number'>NotSupport</entry>
    <entry name='part_number'>FFFF</entry>
  </processor>
  <memory_device>
    <entry name='size'>4096 MB</entry>
    <entry name='form_factor'>DIMM</entry>
    <entry name='locator'>DIMM A2</entry>
    <entry name='bank_locator'>BANK 0</entry>
    <entry name='type'>DDR3</entry>
    <entry name='type_detail'>Synchronous Unbuffered (Unregistered)</entry>
    <entry name='speed'>1600 MHz</entry>
    <entry name='manufacturer'>80CE000080CE</entry>
    <entry name='serial_number'>85DF74FD</entry>
    <entry name='part_number'>M391B5273DH0-YK0</entry>
  </memory_device>
[SNIP]
</sysinfo>

Designing KVM Networks

John Burdette Gage from Sun Microsystems once said, "The network is the computer," which is very true. You can have the best storage solution coupled with the best physical hardware, but without a fast network, they are of no use. Most modern networks have at least one or more 10Gb network adapters per physical server. When it comes to virtualization with KVM, 10Gb is the minimum you should have per network interface. Beyond raw speed, the network topology layer is even more important. In enterprise networks, a flat network space is not a consideration. Open vSwitch (http://openvswitch.org/) is a popular virtual switch that can be used with KVM. The advantage of using Open vSwitch is that it offers flexibility, which KVM networking does not necessarily offer. It is programmatically configurable, and it supports numerous features that are enterprise friendly.

KVM supports the following kinds of networking for assigning IP addresses to VMs:

• Network address translation (NAT) virtual networks
• Bridged networks
• Physical device assignment using Peripheral Component Interconnect (PCI) pass-through
• Single root input/output virtualization (SR-IOV)

How many physical network cards should you have on your hypervisors? One possibility is to divide the network interfaces as follows:

• Two in fail-over mode for storage
• Two in fail-over mode for management of KVM
• Two in fail-over mode for VMs
• One for out-of-band Intelligent Platform Management Interface (IPMI)-based access

An example is shown in Figure 3-5.

Figure 3-5. Network interfaces on hypervisor

Network Address Translation

NAT is generally used with private IP space, RFC 1918, which is 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. By default, KVM picks IP space in the range of 192.168.122.0/24 for VMs. If you do decide to use NAT, and your hypervisor itself is on a NAT network, then you are, in essence, enabling double NAT for VM access, as shown in Figure 3-6.

Figure 3-6. NAT mode

Bridged Network

In bridged network mode, no NAT is done between the VM and the physical network. The VM behaves as though it is another node on the physical network. If the physical hypervisor is on a NAT network, then the VM shares the same network. The advantage of this is reduced complexity of overall network administration. An example is shown in Figure 3-7.

Figure 3-7. Bridged mode

Designing KVM Storage

When it comes to storage, you have to consider the end goal and base your decision on that. If the end goal is to have disposable VMs, with redundancy built in to the application, then having a common, shared storage solution across the KVM hypervisors may not be required. On the other hand, if you are looking for fault tolerance and want to have your VMs up even if one or more hypervisors goes down, then shared storage is the way to go. Without shared storage, when a hypervisor goes down, so do the VMs on that hypervisor. Because the VM image is stored on the hypervisor, or on a storage subsystem that is not shared with other hypervisors, other active hypervisors cannot access the image of the VM and cannot start a VM that is down. You basically have to bring up the hypervisor that crashed, then start the VMs that were running on it. With shared storage, each of the hypervisors has access to the same VM image file, allowing a VM to come up on any other hypervisor if the host hypervisor crashes.

Another issue to take into consideration when designing storage is whether to boot the hypervisor from a local disk or to use a network boot. Using a network boot eliminates the need for local hard drives and can save money. On the other hand, this solution adds complexity, because you have to invest in a storage area network that supports network booting.

Without shared storage, each hypervisor has its own disk on which both KVM and the local disks of the VMs reside, as shown in Figure 3-9.

Figure 3-9. KVM without shared storage

With shared storage, each hypervisor has its own disk on which KVM is installed, and the VMs are stored on the shared storage. The shared storage could be a storage area network (SAN), as shown in Figure 3-10.

Figure 3-10. KVM with shared storage

With shared storage and no local disk, the hypervisor boots from the SAN, and the VMs are also stored on the SAN, as seen in Figure 3-11.

Figure 3-11. KVM with no local disk and SAN boot

What kind of shared storage should you use if you do decide that you need the flexibility offered by shared storage? Some of the options available include the following:

• NFS
• iSCSI
• Fibre Channel-based LUNs

With NFS, a dedicated NFS server or an NFS appliance is suitable. In enterprise networks, NFS appliances tend to be more prevalent for shared storage of VMs than Linux servers dedicated to running NFS. The advantage of an NFS appliance, such as NetApp, is that you are more likely to get faster performance compared with a Linux server running NFS. You cannot boot a hypervisor using NFS alone, but you can use an NFS-mounted partition on the hypervisor to store your VM images. iSCSI can be used to boot your hypervisor off the network. You can install iSCSI on a Linux box or you can use dedicated storage appliances, such as NetApp, which support iSCSI.
The iSCSI target will be the storage appliance, and the initiator will be the hypervisor. It is recommended that you use a dedicated network card on your hypervisor if you decide to boot from iSCSI. If you decide not to boot the hypervisor from iSCSI, you can still use iSCSI to mount a LUN and store your virtual machine images. You will have to use iSCSI multipath to make the same LUN visible across other hypervisors.

Image Selection

The kind of VM image you select has an impact on the amount of storage being used and the performance of the VM. A few image types that are available and listed on the man page (http://linux.die.net/man/1/qemu-img) include the following:

• Raw
• Qcow2
• Qcow
• Cow
• Vdi
• Vmdk
• Vpc
• Cloop

The most popular ones from the list are Qcow/Qcow2 and Raw. Numerous studies have been done on the performance and storage use of one versus the other. Raw images have better performance than Qcow2 images; however, you cannot take snapshots of raw images. One advantage of taking a snapshot of a VM is that you can take a snapshot before a code deployment and, if the deployment does not work out well, you can simply revert to the previous version of the snapshot (Listing 3-13).

Listing 3-13. Snapshot Management

# Creating a snapshot
# virsh snapshot-create vm1.example.com
Domain snapshot 1407102907 created

# Viewing a list of snapshots
# virsh snapshot-list vm1.example.com
 Name        Creation Time              State
------------------------------------------------------------
 1407102907  2014-08-03 14:55:07 -0700  shutoff

# Getting snapshot information
# virsh snapshot-info vm1.example.com --current
Name:           1407102907
Domain:         vm1.example.com
Current:        yes
State:          shutoff
Location:       internal
Parent:         -
Children:       0
Descendants:    0
Metadata:       yes

# View XML information about the snapshot
# virsh snapshot-dumpxml vm1.example.com 1407102907
<domainsnapshot>
  <name>1407102907</name>
  <state>shutoff</state>
  <creationTime>1407102907</creationTime>
  <memory snapshot='no'/>
  <disks>
    <disk name='vda' snapshot='internal'/>
    <disk name='hdc' snapshot='no'/>
  </disks>
  <domain type='kvm'>
    <name>vm1.example.com</name>
    <uuid>ba292588-6570-2674-1425-b2ee6a4e7c2b</uuid>
    <memory unit='KiB'>1048576</memory>
    <currentMemory unit='KiB'>1048576</currentMemory>
    <vcpu placement='static'>1</vcpu>
    <os>
      <type arch='x86_64' machine='rhel6.4.0'>hvm</type>
      <boot dev='hd'/>
    </os>
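To roll a VM back to a snapshot, for example after a failed code deployment, use snapshot-revert; snapshot-delete removes a snapshot you no longer need. A brief sketch using the snapshot created in Listing 3-13:

# Revert the VM to the earlier snapshot; the domain is restored to the saved state.
# virsh snapshot-revert vm1.example.com 1407102907

# Delete the snapshot once it is no longer needed.
# virsh snapshot-delete vm1.example.com 1407102907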
net.ipv6.conf.all.disable_ipv6 = 1 net.ipv6.conf.default.disable_ipv6 = 1 " >> /etc/sysctl.conf # smartd monitors hard drives; no need for that on a VM root# service smartd stop root# chkconfig --del smartd # Allow virsh shutdown to turn of the VM. # If we do a minimal CentOS install, acpid is not installed by default. # yum install acpid # chkconfig acpid on # service acpid start To access the console of a VM using the virsh console command, you have to redirect the VM console output via the serial console. The following steps show you how to do this: # Getting serial console to work on KVM with RHEL 6 and also with GRUB # Comment out splashimage and hiddenmenu # Remove 'rhgb' and 'quiet' from the kernel line # Add the 'serial' and the 'terminal' line # Add the last two 'console' parameters on the kernel line # Now try to access the serial console using 'virsh console <hostname>' # cat /etc/grub.conf # grub.conf generated by anaconda # # Note you do not have to rerun grub after making changes to this fil.e # NOTICE: You have a /boot partition. This means that Chapter 3 ■ Virtualization with KVM 78 # all kernel and initrd paths are relative to /boot/, e.g., # root (hd0,0) # kernel /vmlinuz-version ro root=/dev/mapper/vg_ns-lv_root # initrd /initrd-[generic-]version.img # boot=/dev/vda default=0 timeout=10 serial --unit=0 --speed=115200 terminal --timeout=5 serial console #splashimage=(hd0,0)/grub/splash.xpm.gz #hiddenmenu title CentOS (2.6.32-431.el6.x86_64) root (hd0,0) kernel /vmlinuz-2.6.32-431.el6.x86_64 ro root=/dev/mapper/vg_ns-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 rd_LVM_LV=vg_ns/lv_root crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_LVM_LV=vg_ns/lv_swap rd_NO_DM console=tty0 console=ttyS0,115200 initrd /initramfs-2.6.32-431.el6.x86_64.img Security Considerations In a virtual environment, where many VMs may be running on a physical hypervisor, the VMs may have applications on them that are not related to one another. VMs are processes like any other process on a hypervisor. In theory, it is possible to break out of a VM and access the hypervisor, and then from there, access another VM. In general, KVM maintainers are pretty quick to address security concerns. SELinux offers a layer of security for both the hypervisor and the VM. If you do decide to use SELinux, keep in mind there are some changes that need to be made to get KVM working with SELinux. iptables on both the hypervisor and the VM can protect the hosts at the network layer. The network design and VM distribution also play a role in security. For instance, some enterprises may choose to have application-specific hypervisors. So, if there are, say, two dozen applications, then there are at least two dozen sets of numerous hypervisors per application. This sort of segregation may offer protection if one or more applications has more security risks than the other application. Networks, of course, play a huge role. Having a separate management network for the hypervisors, for IPMI management, for storage, and, last, per application networks helps by adding more layers to the security blanket. Overall, the amount of security has to be balanced with the ease of use. Keep in mind any government or other security standards such as those of the National Institute of Standards and Technology and PCI may have to be followed as well. 
Reference Architecture

Based on what we have learned so far in this chapter, let's design the enterprise virtualization strategy for a sample enterprise called Example.com. The company is an online retailer. A majority of its servers run Apache, serving its online retail needs. There are also some MySQL database servers that act as the back end for Apache. The reference architecture consists of an enterprise that has three data centers: one in Oregon (OR), one in Texas (TX), and one in Virginia (VA), as seen in Figure 3-12. Each data center has 500 hypervisors, and each hypervisor has ten VMs running on it, for a total of 5000 VMs per site. Across the three sites, there is a total of 15,000 VMs. The VMs are all Linux based, running CentOS 6.5. The hypervisors are running CentOS 6.5 with KVM installed on them.

Figure 3-12. Example.com data center sites

The hardware of the hypervisors consists of name-brand servers with seven 10Gb network cards: two for the storage network, two for the management network, one for IPMI, and two for the application networks. There are two solid state drive (SSD)-based hard drives in each server; the drives are mirrored, and each is 120GB in size. Two CPUs are present on each server; the CPUs are Intel E7-8870 2.4GHz Xeons, each of which has 10 cores and 20 threads. Each thread can be allocated as a virtual CPU to a VM, for a total of 40 virtual CPUs. With a VM density of ten VMs per hypervisor, we can assign four virtual CPUs to each VM. Each VM needs an average of 32GB of physical memory, for a total memory footprint of 320GB per server. Add to that the overhead of the hypervisor, and you need 384GB of physical memory. The server has 12 physical memory sockets; if we stick in 32GB DIMMs, that will give us a total of 384GB of physical RAM, which will suit our needs.

For storage, Example.com uses NetApp NFS filers to store the VM images. Shares from the filers are mounted on the hypervisors. Because each site has 500 hypervisors, the hypervisors are divided into smaller pods of 100 hypervisors each, for a total of 5 pods per site. A pod consists of 100 hypervisors, 1000 VMs, and a filer that is part of a cluster. The cluster has at least two heads and numerous disk shelves. A sample pod is shown in Figure 3-13.

Figure 3-13. A pod with ten hypervisors and a cluster of NetApp filers

Chapter 4
MySQL, Git, and Postfix

Cloud databases are hosted in public clouds. Google hosts cloud databases such as Google Cloud Storage and Google Cloud Datastore. There are also other cloud databases, such as Amazon DynamoDB, which is a NoSQL database. Distributed databases store data across a number of instances, and hence they are called distributed. An example of an open source distributed database is Cassandra (https://cassandra.apache.org/). Distributed databases are alternatives to RDBMSs and object databases for data that is massive in size. For instance, social networking data for millions of users might be a potentially suitable candidate for Cassandra. NewSQL databases provide the scalability of NoSQL with the atomicity, consistency, isolation, and durability guarantee of RDBMSs. VoltDB (http://voltdb.com/) and Clustrix (http://www.clustrix.com/) are examples of NewSQL databases.

Picking a Database

There are numerous open source database solutions available.
Some of the popular ones include the following:

• MySQL
• MariaDB (a fork of MySQL; very similar to it)
• Cassandra
• PostgreSQL
• MongoDB
• CouchDB
• SQLite
• Redis

There are many other databases. I have listed a few that, in my experience, have a wide installed base. You can view a more complete listing at https://en.wikipedia.org/wiki/List_of_relational_database_management_systems and also at http://www.fromdev.com/2012/07/best-open-source-nosql-database.html. A few major commercial databases that run on Linux are

• Oracle (http://www.oracle.com/index.html)
• Informix (http://www-01.ibm.com/software/data/informix/)
• IBM DB2 (http://www-01.ibm.com/software/data/db2/)

The choice of a database should be based on factors such as the following:

• Licensing
• Ease of use
• Support from the community
• Commercial support availability
• Type of database required (object, relational, NoSQL)
• Frequency of updates
• Database suitability for the application

When designing a human resources application, as mentioned, the data is relational in nature: employees have a name, social security number, salary, and other such related information. So, picking a relational database would be appropriate. For multimedia applications, such as photo, video, and art, object databases are more popular, because these media can be stored easily as objects. Cloud databases have the advantage of being managed by the cloud provider; however, your data end up in the hands of the cloud provider, which may not be ideal in all cases.

Installing MySQL

There are at least two options for installing MySQL:

1. Using the Yellowdog Updater, Modified (YUM) repositories with CentOS/RedHat
2. Downloading the source code and compiling it on your own

# yum install mysql-server
Loaded plugins: fastestmirror
...[SNIP]...
Installed:
  mysql-server.x86_64 0:5.5.39-1.el6.remi

Dependency Installed:
  mysql.x86_64 0:5.5.39-1.el6.remi

Dependency Updated:
  mysql-libs.x86_64 0:5.5.39-1.el6.remi

Complete!

You may also notice that if you do a yum install mysql-server, MariaDB is installed instead. Some distributions have deprecated MySQL in favor of MariaDB, which is a drop-in replacement, as explained in the section "Future of MySQL."

# yum install mysql-server
Loaded plugins: fastestmirror, security
Setting up Install Process
Package mysql-server is obsoleted by MariaDB-server, trying to install MariaDB-server-10.0.12-1.el6.x86_64 instead
...[SNIP]...
Installed:
  MariaDB-compat.x86_64 0:10.0.12-1.el6  MariaDB-server.x86_64 0:10.0.12-1.el6

Dependency Installed:
  MariaDB-client.x86_64 0:10.0.12-1.el6  MariaDB-common.x86_64 0:10.0.12-1.el6

Replaced:
  mysql-libs.x86_64 0:5.1.66-2.el6_3

Complete!

If possible, use the precompiled binaries for MySQL that your Linux distribution provides; they are a lot easier to maintain and manage. However, in case you want to build your own MySQL, you have to download the source code, compile it, and then install the compiled version. MySQL downloads are available from https://dev.mysql.com/downloads/. The MySQL Community Server is a good place to start.

COMPILING MYSQL

Download the latest version. As of this writing, 5.6 is the latest.

# wget --no-check-certificate https://dev.mysql.com/get/Downloads/MySQL-5.6/MySQL-5.6.20-1.el6.src.rpm
--2014-09-10 16:44:06-- https://dev.mysql.com/get/Downloads/MySQL-5.6/MySQL-5.6.20-1.el6.src.rpm
...[SNIP]...
2014-09-10 16:44:13 (4.72 MB/s) - "MySQL-5.6.20-1.el6.src.rpm" saved [31342030/31342030]

# ls
MySQL-5.6.20-1.el6.src.rpm
# rpm -Uvh ./MySQL-5.6.20-1.el6.src.rpm
   1:MySQL            ########################################### [100%]

Untar the distribution.

# cd /usr/local/src
# cp /root/rpmbuild/SOURCES/mysql-5.6.20.tar.gz .
# ls
MySQL-5.6.20-1.el6.src.rpm mysql-5.6.20.tar.gz
# tar xvfz mysql-5.6.20.tar.gz
mysql-5.6.20/
mysql-5.6.20/Docs/

Install cmake and ncurses-devel.

# cd mysql-5.6.20
# yum install cmake
Loaded plugins: fastestmirror, security
Loading mirror speeds from cached hostfile
...[SNIP]...
Installed:
  cmake.x86_64 0:2.6.4-5.el6

# yum install ncurses-devel -y
Loaded plugins: fastestmirror, security
Loading mirror speeds from cached hostfile
Setting up Install Process
...[SNIP]...
Installed:
  ncurses-devel.x86_64 0:5.7-3.20090208.el6

MySQL Proxy is another piece of software that is in alpha release at this point. The proxy handles splitting reads/writes between two master servers automatically. The proxy is, of course, a single point of failure. Because the software is in alpha release, using it in production may not be suitable. If you decide to use it for nonproduction environments, then set it up as shown in Figure 4-3. The web server is not aware that it is using a proxy, nor does it have to be made aware. Because of the alpha nature of MySQL Proxy, it should be used in noncritical environments only. The scalability of MySQL Proxy, and its speed, should be tested before you start using it actively. Ensure the version of MySQL being used is at least version 5.0, so that it works with MySQL Proxy. MySQL Proxy may also be installed on a given MySQL server; however, it is better if it is installed on another server, because in the event that one MySQL server crashes, you do not lose the proxy server with it and you can continue working. Additional information about MySQL Proxy can be obtained at https://dev.mysql.com/doc/refman/5.0/en/mysql-proxy.html.

Figure 4-2. MySQL master/slave design

Figure 4-3. MySQL failover with MySQL Proxy

Another option is to use HAProxy instead of MySQL Proxy (Figure 4-4). The web server sends all MySQL requests to the HAProxy IP. HAProxy, in turn, checks the availability of both MySQL servers and forwards the requests to the available MySQL server based on the load-balancing algorithm specified. HAProxy is available at http://www.haproxy.org/.

Figure 4-4. MySQL failover with HAProxy

MySQL Enterprise Design

Designing an enterprise MySQL installation is challenging. There are at least two different models, if not more, for an enterprise design. One option is to offer MySQL as a service, or database as a service (DBaaS), for everyone in the enterprise (Figure 4-5). One or more teams manage a few clusters of MySQL database servers, and provide a database login, password, and server name to which application developers can connect. The back-end database, backups, maintenance, and upgrades are all handled by the team providing the service. The advantage of this model is that it is cost efficient. Maintenance is relatively easier, because it is done at a single point. Individual application teams do not have to worry about setting up their own instances and managing them. Security issues can be resolved more quickly because all corporate MySQL instances are tracked through the team managing them.

Figure 4-5. MySQL in-house private cloud (DBaaS) for Web, Enterprise Resource Planning (ERP), Development (Dev), and Quality Assurance (QA)

Another option is to have individual MySQL database instances per application and let the application owners manage the instances as well as their own data (Figure 4-6). This means that each application ownership team has to have MySQL administration experience.

Figure 4-6. Per-application MySQL servers and no in-house MySQL cloud

A more robust approach takes per-application instances and makes them redundant. In this case, each team maintains its own database. However, the databases run in master/master or master/slave configuration for failover capabilities (Figure 4-7).

Figure 4-7. Per-application redundant MySQL servers

A hybrid model involves both: using DBaaS for small-scale applications and dedicated instances for large-scale applications (Figure 4-8). Considering that technology is moving toward providing infrastructure as a service, it has become common for enterprises to adopt the DBaaS model.

Figure 4-8. MySQL hybrid with DBaaS as well as dedicated instances

The process of setting up MySQL replication can be found at https://dev.mysql.com/doc/refman/5.0/en/replication.html.

An enterprise backup strategy is a lot easier when MySQL is offered as a DBaaS, because you can take one of the multiple masters in a DBaaS and make a backup with write locks without affecting the applications. If we have a single master server, then the lock table command might cause an application issue for large tables.

Getting Help with MySQL

MySQL support options include the following:

• Commercial support from Oracle (https://www.mysql.com/support/)
• Community forums (http://forums.mysql.com/)
• Documentation (https://dev.mysql.com/doc/)
• Internet Relay Chat (IRC) channel (https://dev.mysql.com/doc/refman/5.0/en/irc.html)
• Mailing list (http://lists.mysql.com/)
• The MySQL development project is hosted at https://launchpad.net/mysql, if you want to contribute to the project.
• Expert MySQL is a good guide for MySQL (http://www.apress.com/9781430246596).

Future of MySQL

MySQL was owned and operated by a single for-profit company, the Swedish company MySQL AB, until it was purchased by Sun Microsystems in 2008. Sun was later acquired by Oracle in 2009. The open source community viewed the ownership of MySQL by Oracle as an inherent conflict of interest, and has come up with a drop-in MySQL replacement called MariaDB (https://mariadb.org/). The main author of MySQL, Michael "Monty" Widenius, supported the split from MySQL into MariaDB. Popular Linux distributions have already switched to providing MariaDB as part of their distribution instead of MySQL. However, MySQL continues to enjoy a large installed base and is still very popular.

Should an enterprise pick MySQL or MariaDB? This question can best be answered by reviewing the enterprise policy on open source software. If an enterprise has a culture of using open source software and agrees with the viewpoint that Oracle ownership of MySQL creates an uncertain future for MySQL, then MariaDB is the right choice. On the other hand, Oracle has, for the past few years, continued the development of MySQL, so one could argue that this is sufficient reason for investing in MySQL. As time progresses, there is a possibility that MariaDB will diverge further from MySQL.
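Because MariaDB bills itself as a drop-in replacement, it is worth confirming which server a given host is actually running before planning a direction. A quick check follows; the version string shown is illustrative:

# MariaDB identifies itself in the version string; stock MySQL reports a plain version number.
# mysql -e "SELECT VERSION();"
+-----------------+
| VERSION()       |
+-----------------+
| 10.0.12-MariaDB |
+-----------------+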
MySQL Enterprise Design

Designing an enterprise MySQL installation is challenging. There are at least two different models, if not more, for an enterprise design. One option is to offer MySQL as a service, or database as a service (DBaaS), for everyone in the enterprise (Figure 4-5). One or more teams manage a few clusters of MySQL database servers and provide a database login, password, and server name to which application developers can connect. The back-end database, backups, maintenance, and upgrades are all handled by the team providing the service. The advantage of this model is that it is cost efficient. Maintenance is relatively easy because it is done at a single point. Individual application teams do not have to worry about setting up their own instances and managing them. Security issues can be resolved more quickly because all corporate MySQL instances are tracked through the team managing them.

Figure 4-5. MySQL in-house private cloud (DBaaS) for Web, Enterprise Resource Planning (ERP), Development (Dev), and Quality Assurance (QA)

Another option is to have individual MySQL database instances per application and let the application owners manage the instances as well as their own data (Figure 4-6). This means that each application ownership team has to have MySQL administration experience.

Figure 4-6. Per-application MySQL servers and no in-house MySQL cloud

A more robust approach takes per-application instances and makes them redundant. In this case, each team maintains its own database. However, the databases run in master/master or master/slave configuration for failover capability (Figure 4-7).

Figure 4-7. Per-application redundant MySQL servers

A hybrid model uses both: DBaaS for small-scale applications and dedicated instances for large-scale applications (Figure 4-8). As technology moves toward providing infrastructure as a service, it has become common for enterprises to adopt the DBaaS model.

Figure 4-8. MySQL Hybrid with DBaaS as well as dedicated instances

The process of setting up MySQL replication is documented at https://dev.mysql.com/doc/refman/5.0/en/replication.html; the essential steps are sketched below.
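As a brief illustration of those steps, a basic master/slave pair needs binary logging and a unique server ID on the master, a distinct server ID on the slave, a replication account, and a CHANGE MASTER TO statement on the slave. The IP addresses, account name, password, and binary log coordinates shown here are placeholders; take the real coordinates from SHOW MASTER STATUS.

On the master, in /etc/my.cnf:

[mysqld]
server-id = 1
log-bin   = mysql-bin

On the master, create the replication account and note the log coordinates:

mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'10.1.1.20' IDENTIFIED BY 'secret';
mysql> SHOW MASTER STATUS;

On the slave, set server-id = 2 in /etc/my.cnf, then point the slave at the master:

mysql> CHANGE MASTER TO MASTER_HOST='10.1.1.10', MASTER_USER='repl',
    -> MASTER_PASSWORD='secret', MASTER_LOG_FILE='mysql-bin.000001',
    -> MASTER_LOG_POS=120;
mysql> START SLAVE;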
An enterprise backup strategy is a lot easier when MySQL is offered as a DBaaS, because you can take one of the multiple masters, lock it for writes, and make a backup without affecting the applications. With a single master server, the LOCK TABLES command needed for a consistent backup might cause application issues on large tables.

Getting Help with MySQL

MySQL support options include the following:

• Commercial support from Oracle (https://www.mysql.com/support/)
• Community forums (http://forums.mysql.com/)
• Documentation (https://dev.mysql.com/doc/)
• Internet Relay Chat (IRC) channel (https://dev.mysql.com/doc/refman/5.0/en/irc.html)
• Mailing lists (http://lists.mysql.com/)
• The MySQL development project, hosted at https://launchpad.net/mysql, if you want to contribute to the project
• Expert MySQL, a good guide to MySQL (http://www.apress.com/9781430246596)

Future of MySQL

MySQL was owned and operated by a single for-profit company, the Swedish firm MySQL AB, until it was purchased by Sun Microsystems in 2008. Sun was in turn acquired by Oracle in 2009. The open source community viewed Oracle's ownership of MySQL as an inherent conflict of interest and came up with a drop-in MySQL replacement called MariaDB (https://mariadb.org/). The main author of MySQL, Michael “Monty” Widenius, backed the fork of MySQL that became MariaDB. Popular Linux distributions have already switched to shipping MariaDB instead of MySQL. However, MySQL continues to enjoy a large installed base and is still very popular.

Should an enterprise pick MySQL or MariaDB? This question is best answered by reviewing the enterprise policy on open source software. If an enterprise has a culture of using open source software and agrees with the viewpoint that Oracle's ownership creates an uncertain future for MySQL, then MariaDB is the right choice. On the other hand, Oracle has continued to develop MySQL over the past few years, and one could argue that this is sufficient reason for investing in MySQL. As time progresses, there is a possibility that MariaDB will diverge further from MySQL.

E-mail in an Enterprise

E-mail plays at least two important roles in an enterprise. One role is the obvious one: communications, both internal and external. The other is the use of e-mail for infrastructure management. Some examples of using e-mail in infrastructure management include the following:

• For companies that use monitoring tools such as Nagios, e-mail is one way that alerts are delivered.
• E-mail from system services, such as cron, can be collected at a central location and analyzed for troubleshooting.
• Applications on servers can use e-mail gateways for relaying application-specific information.
• Security teams may want to analyze system-related e-mail to determine whether any server or application has been compromised.

Enterprises have different strategies when dealing with mail from servers. Some of these strategies are:

• Sending all system-related e-mail to /dev/null and using extensive out-of-band monitoring of services along with SMS-based alerts
• Leaving system-related e-mail on the respective systems, processing it there for security and troubleshooting, and then pruning it as needed
• Forwarding all e-mail from systems to an e-mail gateway that then stores and processes it

As a site reliability engineer, your choice of action should take into consideration at least the following:

• The number of systems you have, because if you manage thousands of servers, receiving e-mail from each of them may not be practical
• How important system-related e-mail is to you, because a comprehensive monitoring architecture can provide the same information that system-related e-mail provides
• Configuring MTAs in applications versus relying on the system-provided MTA, because if an application can send e-mail to a gateway without relying on the system mailer, that option saves you the trouble of configuring e-mail on the servers

E-mail Solution Strategy

An effective enterprise e-mail solution strategy is crucial for business. There are two main choices when it comes to an e-mail solution:

• Setting up e-mail in-house using open source or commercial software, such as Postfix or Microsoft Exchange
• Using a cloud provider for e-mail, such as Google, Microsoft, or Rackspace

The choice between building in-house and using a public cloud depends on a number of factors. During the decision-making process, spend ample time analyzing the pros and cons before making your decision. A few things to keep in mind are the following:

• Any regulations an organization is subject to that would prevent it from hosting e-mail with a cloud provider
• Cost of the cloud solution versus an in-house solution (for instance, Gmail for Work is around $5 per person per month at list price)
• Whether an existing solution is in-house, because the effort to migrate to the cloud might be too painful
• Client access methods for cloud e-mail providers (for example, if your users want to sync e-mail/calendar with their smartphones)

Hosted e-mail providers are numerous, and their costs vary tremendously. Many of them provide e-mail, calendar, and smartphone sync features. Exchange hosting is also very popular, and numerous Microsoft partners provide hosted Exchange accounts. Some of the options for hosted e-mail are:

• Google Gmail for Work (https://www.gmail.com/intl/en/mail/help/about.html)
• Hosted Exchange (https://office.microsoft.com/en-us/)
• Rackspace-hosted e-mail (http://www.rackspace.com/email-hosting/webmail/)

Having your e-mail in a public cloud is very convenient. A few advantages of using hosted e-mail are:

• No capital expenditure on hardware or software
• No engineering staff required to maintain mail infrastructure
• A total cost of ownership that can be less than that of an in-house infrastructure

Not everything is rosy with having your e-mail in a public cloud, though. Disadvantages of hosted e-mail include the following:

• Your e-mail security is in the hands of someone else.
• Recurring operational costs may be higher than those of an in-house solution.
• Not all e-mail requirements, such as archiving, may be available from hosting providers.
• Loss of Internet connectivity from the corporate office may result in loss of e-mail access as well.

Enterprise Mail Transfer Agents

An MTA is software that transfers mail between servers. A mail user agent is the software end users use to download and read e-mail. The choice of MTAs is huge; some of the more popular ones include the following:

• Sendmail
• Postfix
• Qmail
• Apache James
• Exim

Sendmail was the default MTA on CentOS until CentOS 6.x was released, at which time Postfix became the default. When picking an MTA for your organization, keep the following few items in mind:

• Open source or commercial?
• If open source, do you need commercial support?
• How active is the end-user community?
• How quickly are security holes plugged?
• What is the release cycle of the product like?
• Does the MTA meet your organization's technical needs for speed and reliability?
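For the common strategy of forwarding system mail from every server to a central gateway, the per-server change is small when Postfix is the MTA. The following is a minimal sketch; mailhub.example.com stands in for a hypothetical internal mail gateway, and the square brackets tell Postfix to connect to that host directly instead of performing an MX lookup. In /etc/postfix/main.cf:

relayhost = [mailhub.example.com]

Then reload Postfix to pick up the change:

# service postfix reload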