
How-To Tutorials - Cloud Computing

121 Articles

High Availability, Protection, and Recovery using Microsoft Azure

Packt
02 Apr 2015
23 min read
Microsoft Azure can be used to protect your on-premise assets such as virtual machines, applications, and data. In this article by Marcel van den Berg, the author of Managing Microsoft Hybrid Clouds, you will learn how to use Microsoft Azure to store backup data, replicate data, and even for orchestration of a failover and failback of a complete data center. We will focus on the following topics: High Availability in Microsoft Azure Introduction to geo-replication Disaster recovery using Azure Site Recovery (For more resources related to this topic, see here.) High availability in Microsoft Azure One of the most important limitations of Microsoft Azure is the lack of an SLA for single-instance virtual machines. If a virtual machine is not part of an availability set, that instance is not covered by any kind of SLA. The reason for this is that when Microsoft needs to perform maintenance on Azure hosts, in many cases, a reboot is required. Reboot means the virtual machines on that host will be unavailable for a while. So, in order to accomplish High Availability for your application, you should have at least two instances of the application running at any point in time. Microsoft is working on some sort of hot patching which enables virtual machines to remain active on hosts being patched. Details are not available at the moment of writing. High Availability is a crucial feature that must be an integral part of an architectural design, rather than something that can be "bolted on" to an application afterwards. Designing for High Availability involves leveraging both the development platform as well as available infrastructure in order to ensure an application's responsiveness and overall reliability. The Microsoft Azure Cloud platform offers software developers PaaS extensibility features and network administrators IaaS computing resources that enable availability to be built into an application's design from the beginning. The good news is that organizations with mission-critical applications can now leverage core features within the Microsoft Azure platform in order to deploy highly available, scalable, and fault-tolerant cloud services that have been shown to be more cost-effective than traditional approaches that leverage on-premises systems. Microsoft Failover Clustering support Windows Server Failover Clustering (WSFC) is not supported on Azure. However, Microsoft does support SQL Server AlwaysOn Availability Groups. For AlwaysOn Availability Groups, there is currently no support for availability group listeners in Azure. Also, you must work around a DHCP limitation in Azure when creating WSFC clusters in Azure. After you create a WSFC cluster using two Azure virtual machines, the cluster name cannot start because it cannot acquire a unique virtual IP address from the DHCP service. Instead, the IP address assigned to the cluster name is a duplicate address of one of the nodes. This has a cascading effect that ultimately causes the cluster quorum to fail, because the nodes cannot properly connect to one another. So if your application uses Failover Clustering, it is likely that you will not move it over to Azure. It might run, but Microsoft will not assist you when you encounter issues. Load balancing Besides clustering, we can also create highly available nodes using load balancing. Load balancing is useful for stateless servers. These are servers that are identical to each other and do not have a unique configuration or data. 
When two or more virtual machines deliver the same application logic, you need a mechanism that can redirect network traffic to those virtual machines. The Windows Network Load Balancing (NLB) feature of Windows Server is not supported on Microsoft Azure; instead, the Azure load balancer does exactly this: it analyzes the network traffic arriving at Azure, determines the type of traffic, and reroutes it to a service.

The Azure load balancer is provided as a cloud service. In fact, this cloud service runs on virtual appliances managed by Microsoft that are completely software-defined. The moment an administrator adds an endpoint, a set of load balancers is instructed to pass incoming network traffic on a certain port to a port on a virtual machine. If a load balancer fails, another one takes over.

Azure load balancing is performed at layer 4 of the OSI model. This means the load balancer is not aware of the application content of the network packets; it simply distributes packets based on network ports.

To load balance over multiple virtual machines, you can create a load-balanced set by performing the following steps:

1. In the Azure Management Portal, select the virtual machine whose service should be load balanced.
2. Select Endpoints in the upper menu.
3. Click on Add.
4. Select Add a stand-alone endpoint and click on the right arrow.
5. Select a name and a protocol, and set the public and private port.
6. Enable Create a load-balanced set and click on the right arrow.
7. Fill in a name for the load-balanced set.
8. Fill in the probe port, the probe interval, and the number of probes. This information is used by the load balancer to check whether the service is available: it connects to the probe port at the specified interval, and if the specified number of probes all fail to connect, the load balancer will no longer distribute traffic to this virtual machine.
9. Click on the check mark.

The load balancing mechanism is based on a hash. Microsoft Azure Load Balancer uses a five-tuple (source IP, source port, destination IP, destination port, and protocol type) to calculate the hash that is used to map traffic to the available servers. A second load balancing mode was introduced in October 2014, called Source IP Affinity (also known as session affinity or client IP affinity). With Source IP Affinity, connections initiated from the same client computer go to the same DIP endpoint.
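To make the five-tuple hashing idea concrete, here is a minimal bash sketch: the tuple is hashed and the result is reduced modulo the number of back-end servers. This only illustrates the general technique, not Azure's actual implementation; the back-end addresses and tuple values are invented for the example.

```bash
#!/usr/bin/env bash
# Illustration only: distribute connections over a pool of back ends by
# hashing the 5-tuple, the way a layer-4 load balancer does.

backends=("10.0.0.4" "10.0.0.5" "10.0.0.6")   # hypothetical DIP endpoints

pick_backend() {
  local src_ip=$1 src_port=$2 dst_ip=$3 dst_port=$4 proto=$5
  local h
  # Hash the whole tuple, keep the first 8 hex digits, and reduce modulo
  # the number of back ends.
  h=$(printf '%s' "${src_ip}:${src_port}:${dst_ip}:${dst_port}:${proto}" | md5sum | cut -c1-8)
  echo "${backends[$(( 16#$h % ${#backends[@]} ))]}"
}

# The same tuple always maps to the same back end; a new connection from a
# different source port may land on another one.
pick_backend 203.0.113.10 50000 192.0.2.80 443 TCP
pick_backend 203.0.113.10 50001 192.0.2.80 443 TCP
```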
These load balancers provide high availability inside a single data center. If a virtual machine that is part of a cluster of instances fails, the load balancer will notice and remove that virtual machine's IP address from its table. However, load balancers will not protect against the failure of a complete data center: the domains used to direct clients to an application route to a particular virtual IP that is bound to one Azure data center. To keep the application reachable even if an Azure region has failed, you can use Azure Traffic Manager. This service can be used for several purposes:

- To fail over to a different Azure region if a disaster occurs
- To provide the best user experience by directing network traffic to the Azure region closest to the user's location
- To reroute traffic to another Azure region whenever there is planned maintenance

The main task of Traffic Manager is to map a DNS query to an IP address that is the access point of a service. This job can be compared, for example, with the job of someone working at the X-ray machines at an airport: you have probably seen those multiple rows of X-ray machines, where the length of each queue differs from moment to moment and an officer standing at the entrance distributes people over the available machines so that all queues remain roughly equal in length.

Traffic Manager provides you with a choice of load-balancing methods, including performance, failover, and round-robin. Performance load balancing measures the latency between the client and the cloud service endpoint; Traffic Manager is not aware of the actual load on the virtual machines servicing the application. Because Traffic Manager resolves endpoints of Azure cloud services only, it cannot be used for load balancing between an Azure region and a non-Azure region (for example, Amazon EC2) or between on-premises and Azure services.

Traffic Manager performs health checks on a regular basis by querying the endpoints of the services. If an endpoint does not respond, Traffic Manager stops distributing network traffic to that endpoint for as long as the endpoint remains unavailable. Traffic Manager is available in all Azure regions. Microsoft charges for the service based on the number of DNS queries received by Traffic Manager. As the service is attached to an Azure subscription, you will need to contact Azure support to transfer Traffic Manager to a different subscription.

The following comparison summarizes the differences between Azure's built-in load balancer and Traffic Manager:

- Distribution targets: load balancer targets must reside in the same region; Traffic Manager targets can be spread across regions.
- Load balancing method: the load balancer uses the five-tuple hash or Source IP Affinity; Traffic Manager offers performance, failover, and round-robin.
- Level: the load balancer works at OSI layer 4 on TCP/UDP ports; Traffic Manager works on DNS queries.

Third-party load balancers

In certain configurations, the default Azure load balancer might not be sufficient. Several vendors support, or are starting to support, Azure; one of them is Kemp Technologies. Kemp Technologies offers a free load balancer for Microsoft Azure. The Virtual LoadMaster (VLM) provides layer 7 application delivery. The virtual appliance has some limitations compared to the commercially available unit: the maximum bandwidth is limited to 100 Mbps, High Availability is not offered (which makes the free Kemp LoadMaster for Azure a single point of failure), and the number of SSL transactions per second is limited.

One use case in which a third-party load balancer is required is Microsoft Remote Desktop Gateway. As you might know, Citrix has supported running Citrix XenApp and Citrix XenDesktop on Azure since 2013, which means service providers can offer cloud-based desktops and applications using these Citrix solutions. To make this a working configuration, session affinity is required. Session affinity makes sure that network traffic is always routed over the same server. Windows Server 2012 Remote Desktop Gateway uses two HTTP channels, one for input and one for output, which must be routed over the same Remote Desktop Gateway. The Azure load balancer is only able to do round-robin load balancing, which does not guarantee that both channels use the same server. However, hardware and software load balancers that support IP affinity, cookie-based affinity, or SSL ID-based affinity (and thus ensure that both HTTP connections are routed to the same server) can be used with Remote Desktop Gateway. Another use case is load balancing of Active Directory Federation Services (ADFS).
Microsoft Azure can be used as a backup for on-premises Active Directory (AD). Suppose your organization is using Office 365. To provide single sign-on, a federation has been set up between Office 365 directory and your on-premises AD. If your on-premises ADFS fails, external users would not be able to authenticate. By using Microsoft Azure for ADFS, you can provide high availability for authentication. Kemp LoadMaster for Azure can be used to load balance network traffic to ADFS and is able to do proper load balancing. To install Kemp LoadMaster, perform the following steps: Download the Publish Profile settings file from https://windows.azure.com/download/publishprofile.aspx. Use PowerShell for Azure with the Import-AzurePublishSettingsFile command. Upload the KEMP supplied VHD file to your Microsoft Azure storage account. Publish the VHD as an image. The VHD will be available as an image. The image can be used to create virtual machines. The complete steps are described in the documentation provided by Kemp. Geo-replication of data Microsoft Azure has geo-replication of Azure Storage enabled by default. This means all of your data is not only stored at three different locations in the primary region, but also replicated and stored at three different locations at the paired region. However, this data cannot be accessed by the customer. Microsoft has to declare a data center or storage stamp as lost before Microsoft will failover to the secondary location. In the rare circumstance where a failed storage stamp cannot be recovered, you will experience many hours of downtime. So, you have to make sure you have your own disaster recovery procedures in place. Zone Redundant Storage Microsoft offers a third option you can use to store data. Zone Redundant Storage (ZRS) is a mix of two options for data redundancy and allows data to be replicated to a secondary data center / facility located in the same region or to a paired region. Instead of storing six copies of data like geo-replicated storage does, only three copies of data are stored. So, ZRS is a mix of local redundant storage and geo-replicated storage. The cost for ZRS is about 66 percent of the cost for GRS. Snapshots of the Microsoft Azure disk Server virtualization solutions such as Hyper-V and VMware vSphere offer the ability to save the state of a running virtual machine. This can be useful when you're making changes to the virtual machine but want to have the ability to reverse those changes if something goes wrong. This feature is called a snapshot. Basically, a virtual disk is saved by marking it as read only. All writes to the disk after a snapshot has been initiated are stored on a temporary virtual disk. When a snapshot is deleted, those changes are committed from the delta disk to the initial disk. While the Microsoft Azure Management Portal does not have a feature to create snapshots, there is an ability to make point-in-time copies of virtual disks attached to virtual machines. Microsoft Azure Storage has the ability of versioning. Under the hood, this works differently than snapshots in Hyper-V. It creates a snapshot blob of the base blob. Snapshots are by no ways a replacement for a backup, but it is nice to know you can save the state as well as quickly reverse if required. Introduction to geo-replication By default, Microsoft replicates all data stored on Microsoft Azure Storage to the secondary location located in the paired region. Customers are able to enable or disable the replication. 
When enabled, customers are charged. When Geo Redundant Storage has been enabled on a storage account, all data is asynchronous replicated. At the secondary location, data is stored on three different storage nodes. So even when two nodes fail, the data is still accessible. However, before the read access Geo-Redundant feature was available, customers had no way to actually access replicated data. The replicated data could only be used by Microsoft when the primary storage could not be recovered again. Microsoft will try everything to restore data in the primary location and avoid a so-called geo-failover process. A geo-failover process means that a storage account's secondary location (the replicated data) will be configured as the new primary location. The problem is that a geo-failover process cannot be done per storage account, but needs to be done at the storage stamp level. A storage stamp has multiple racks of storage nodes. You can imagine how much data and how many customers are involved when a storage stamp needs to failover. Failover will have an effect on the availability of applications. Also, because of the asynchronous replication, some data will be lost when a failover is performed. Microsoft is working on an API that allows customers to failover a storage account themselves. When geo-redundant replication is enabled, you will only benefit from it when Microsoft has a major issue. Geo-redundant storage is neither a replacement for a backup nor for a disaster recovery solution. Microsoft states that the Recover Point Objective (RPO) for Geo Redundant Storage will be about 15 minutes. That means if a failover is required, customers can lose about 15 minutes of data. Microsoft does not provide a SLA on how long geo-replication will take. Microsoft does not give an indication for the Recovery Time Objective (RTO). The RTO indicates the time required by Microsoft to make data available again after a major failure that requires a failover. Microsoft once had to deal with a failure of storage stamps. They did not do a failover but it took many hours to restore the storage service to a normal level. In 2013, Microsoft introduced a new feature called Read Access Geo Redundant Storage (RA-GRS). This feature allows customers to perform reads on the replicated data. This increases the read availability from 99.9 percent when GRS is used to above 99.99 percent when RA-GRS is enabled. Microsoft charges more when RA-GRS is enabled. RA-GRS is an interesting addition for applications that are primarily meant for read-only purposes. When the primary location is not available and Microsoft has not done a failover, writes are not possible. The availability of the Azure Virtual Machine service is not increased by enabling RA-GRS. While the VHD data is replicated and can be read, the virtual machine itself is not replicated. Perhaps this will be a feature for the future. Disaster recovery using Azure Site Recovery Disaster recovery has always been on the top priorities for organizations. IT has become a very important, if not mission-critical factor for doing business. A failure of IT could result in loss of money, customers, orders, and brand value. 
There are many situations that can disrupt IT, such as:

- Hurricanes
- Floods
- Earthquakes
- Disasters such as the failure of a nuclear power plant
- Fire
- Human error
- Outbreak of a virus
- Hardware or software failure

While these threats are clear and the risk of being hit by such a threat can be calculated, many organizations do not have proper protection against them. In three different situations, disaster recovery solutions can help an organization continue doing business:

- Avoiding a possible failure of IT infrastructure by moving servers to a different location.
- Avoiding a disaster situation, such as hurricanes or floods, since such situations are generally well known in advance thanks to weather forecasting.
- Recovering as quickly as possible when a disaster has hit the data center. Disaster recovery is performed when a disaster unexpectedly hits the data center, such as a fire, hardware error, or human error.

Some reasons for not having a proper disaster recovery plan are complexity, lack of time, and ignorance; however, in most cases, a lack of budget and the belief that disaster recovery is expensive are the main reasons. Almost all organizations that have been hit by a major disaster causing unacceptable periods of downtime started to implement a disaster recovery plan, including technology, immediately after they recovered. However, in many cases, this insight came too late. According to Gartner, 43 percent of companies experiencing disasters never reopen and 29 percent close within 2 years.

Server virtualization has made disaster recovery a lot easier and more cost effective. Verifying that your DR procedure actually works as designed and meets your RTO and RPO is much easier using virtual machines. Since Windows Server 2012, Hyper-V has a feature for asynchronous replication of virtual machine virtual disks to another location. This feature, Hyper-V Replica, is very easy to enable and configure, and it does not cost extra. Hyper-V Replica is storage agnostic, which means the storage type at the primary site can be different from the storage type used at the secondary site. So, Hyper-V Replica works perfectly when your virtual machines are hosted on, for example, EMC storage while an HP solution is used at the secondary site.

While replication is a must for DR, another very useful feature in DR is automation. As an administrator, you really appreciate the option to click on a button after deciding to perform a failover and then sit back and relax. Recovery is usually a stressful job when your primary location is flooded or burned, and lots of things can go wrong if recovery is done manually. This is why Microsoft designed Azure Site Recovery. Azure Site Recovery is able to assist in disaster recovery in several scenarios:

- A customer has two data centers, both running Hyper-V managed by System Center Virtual Machine Manager. Hyper-V Replica is used to replicate data at the virtual machine level.
- A customer has two data centers, both running Hyper-V managed by System Center Virtual Machine Manager. NetApp storage is used to replicate between the two sites at the storage level.
- A customer has a single data center running Hyper-V managed by System Center Virtual Machine Manager.
- A customer has two data centers, both running VMware vSphere. In this case, InMage Scout software is used to replicate between the two data centers. Azure is not used for orchestration.
- A customer has a single data center not managed by System Center Virtual Machine Manager.
In the second scenario, Microsoft Azure is used as a secondary data center if a disaster makes the primary data center unavailable. Microsoft has also announced support for a scenario in which vSphere is used on-premises and Azure Site Recovery is used to replicate data to Azure; to enable this, InMage software will be used. Details were not available at the time this article was written.

In the first two scenarios described, Site Recovery is used to orchestrate the failover and failback to the secondary location. Management is done using the Azure Management Portal, which is available from any browser supporting HTML5, so a failover can be initiated even from a tablet or smartphone.

Using Azure as a secondary data center for disaster recovery

Azure Site Recovery went into preview in June 2014. For organizations using Hyper-V, there is no direct need to have a secondary data center, as Azure can be used as a target for Hyper-V Replica. Some of the characteristics of the service are as follows:

- Allows nondisruptive disaster recovery failover testing
- Automated reconfiguration of the network configuration of guests
- Storage agnostic: supports any type of on-premises storage supported by Hyper-V
- Support for VSS to enable application consistency
- Protects more than 1,000 virtual machines (Microsoft tested with 2,000 virtual machines and this went well)

To be able to use Site Recovery, customers do not have to use System Center Virtual Machine Manager; Site Recovery can be used without it installed. Site Recovery uses information such as the virtual networks provided by SCVMM to map them to the networks available in Microsoft Azure.

Site Recovery does not support sending a copy of the virtual hard disks on removable media to an Azure data center to avoid performing the initial replication over the WAN (seeding). Customers will need to transfer all the replication data over the network. ExpressRoute will help to achieve much better throughput compared to a site-to-site VPN over the Internet.

Failover to Azure can be as simple as clicking a single button. Site Recovery will then create new virtual machines in Azure and start them in the order defined in the recovery plan. A recovery plan is a workflow that defines the startup sequence of virtual machines. It is possible to pause the recovery plan to allow a manual check, for example; if all is okay, the recovery plan will continue doing its job. Multiple recovery plans can be created. Microsoft Volume Shadow Copy Service (VSS) is supported, which allows application consistency.

Replication of data can be configured at intervals of 15 seconds, 5 minutes, or 15 minutes, and replication is performed asynchronously. For recovery, 24 recovery points are available. These are like snapshots or point-in-time copies: if the most recent replica cannot be used (for example, because of damaged data), another replica can be used for the restore.

You can also configure extended replication. In extended replication, your Replica server forwards changes that occur on the primary virtual machines to a third server (the extended Replica server). After a planned or unplanned failover from the primary server to the Replica server, the extended Replica server provides further business continuity protection. As with ordinary replication, you configure extended replication by using Hyper-V Manager, Windows PowerShell (using the -Extended option), or WMI. At the moment, only the VHD virtual disk format is supported.
Generation 2 virtual machines that can be created on Hyper-V are not supported by Site Recovery. Generation 2 virtual machines have a simplified virtual hardware model and support Unified Extensible Firmware Interface (UEFI) firmware instead of BIOS-based firmware. Also, boot from PXE, SCSI hard disk, SCSCI DVD, and Secure Boot are supported in Generation 2 virtual machines. However on March 19 Microsoft responded to numerous customer requests on support of Site Recovery for Generation 2 virtual machines. Site Recovery will soon support Gen 2 VM's. On failover, the VM will be converted to a Gen 1 VM. On failback, the VM will be converted to Gen 2. This conversion is done till the Azure platform natively supports Gen 2 VM's. Customers using Site Recovery are charged only for consumption of storage as long as they do not perform a failover or failover test. Failback is also supported. After running for a while in Microsoft Azure customers are likely to move their virtual machines back to the on-premises, primary data center. Site Recovery will replicate back only the changed data. Mind that customer data is not stored in Microsoft Azure when Hyper-V Recovery Manager is used. Azure is used to coordinate the failover and recovery. To be able to do this, it stores information on network mappings, runbooks, and names of virtual machines and virtual networks. All data sent to Azure is encrypted. By using Azure Site Recovery, we can perform service orchestration in terms of replication, planned failover, unplanned failover, and test failover. The entire engine is powered by Azure Site Recovery Manager. Let's have a closer look on the main features of Azure Site Recovery. It enables three main scenarios: Test Failover or DR Drills: Enable support for application testing by creating test virtual machines and networks as specified by the user. Without impacting production workloads or their protection, HRM can quickly enable periodic workload testing. Planned Failovers (PFO): For compliance or in the event of a planned outage, customers can use planned failovers, virtual machines are shutdown, final changes are replicated to ensure zero data loss, and then virtual machines are brought up in order on the recovery site as specified by the RP. More importantly, failback is a single-click gesture that executes a planned failover in the reverse direction. Unplanned Failovers (UFO): In the event of unplanned outage or a natural disaster, HRM opportunistically attempts to shut down the primary machines if some of the virtual machines are still running when the disaster strikes. It then automates their recovery on the secondary site as specified by the RP. If your secondary site uses a different IP subnet, Site Recovery is able to change the IP configuration of your virtual machines during the failover. Part of the Site Recovery installation is the installation of a VMM provider. This component communicates with Microsoft Azure. Site Recovery can be used even if you have a single VMM to manage both primary and secondary sites. Site Recovery does not rely on availability of any component in the primary site when performing a failover. So it doesn't matter if the complete site including link to Azure has been destroyed, as Site Recovery will be able to perform the coordinated failover. Azure Site Recovery to customer owned sites is billed per protected virtual machine per month. The costs are approximately €12 per month. Microsoft bills for the average consumption of virtual machines per month. 
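The average-consumption billing model is easy to sketch with a little shell arithmetic. The per-instance prices used below are the ones quoted in this article (approximately €12 per protected virtual machine per month for customer-owned secondary sites, and €40.22 per protected instance per month when Azure is the target) and may well have changed since.

```bash
# Sketch of the average-consumption model: protect 20 VMs for the first half
# of a 30-day month and none for the second half.
protected_first_half=20
protected_second_half=0
days_in_month=30

avg=$(( (protected_first_half * days_in_month / 2 + protected_second_half * days_in_month / 2) / days_in_month ))
echo "Average protected instances this month: $avg"          # 10

# Customer-owned secondary site, ~12 EUR per protected VM per month:
echo "Charge (own secondary site): $(( avg * 12 )) EUR"

# Azure as the target, 40.22 EUR per protected instance per month
# (bash has no floating point, so use awk for this one):
awk -v n="$avg" 'BEGIN { printf "Charge (Azure as target): %.2f EUR\n", n * 40.22 }'
```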
So if you are protecting 20 virtual machines in the first half and 0 in the second half, you will be charged for 10 virtual machines for that month. When Azure is used as a target, Microsoft will only charge for consumption of storage during replication. The costs for this scenario are €40.22/month per instance protected. As soon as you perform a test failover or actual failover Microsoft will charge for the virtual machine CPU and memory consumption. Summary Thus this article has covered the concepts of High Availability in Microsoft Azure and disaster recovery using Azure Site Recovery, and also gives an introduction to the concept of geo-replication. Resources for Article: Further resources on this subject: Windows Azure Mobile Services - Implementing Push Notifications using [article] Configuring organization network services [article] Integration with System Center Operations Manager 2012 SP1 [article]


Creating and Managing VMFS Datastores

Packt
05 Mar 2015
5 min read
In this article by Abhilash G B, author of VMware vSphere 5.5 Cookbook, we will learn how to expand or grow a VMFS datastore with the help of two methods: using the Increase Datastore Capacity wizard and using the ESXi CLI tool vmkfstools. (For more resources related to this topic, see here.) Expanding/growing a VMFS datastore It is likely that you would run out of free space on a VMFS volume over time as you end up deploying more and more VMs on it, especially in a growing environment. Fortunately, accommodating additional free space on a VMFS volume is possible. However, this requires that the LUN either has free space left on it or it has been expanded/resized in the storage array. The procedure to resize/expand the LUN in the storage array differs from vendor to vendor, we assume that the LUN either has free space on it or has already been expanded. The following flowchart provides a high-level overview of the procedure: How to do it... We can expand a VMFS datastore using two methods: Using the Increase Datastore Capacity wizard Using the ESXi CLI tool vmkfstools Before attempting to grow the VMFS datastore, issue a rescan on the HBAs to ensure that the ESXi sees the increased size of the LUN. Also, make note of the NAA ID, LUN number, and the size of the LUN backing the VMFS datastore that you are trying to expand/grow. Using the Increase Datastore Capacity wizard We will go through the following process to expand an existing VMFS datastore using the vSphere Web Client's GUI. Use the vSphere Web Client to connect to vCenter Server. Navigate to Home | Storage. With the data center object selected, navigate to Related Objects | Datastores: Right-click on the datastore you intend to expand and click on Increase Datastore Capacity...:  Select the LUN backing the datastore and click on Next:  Use the Partition Configuration drop-down menu to select the free space left in DS01 to expand the datastore: On the Ready to Complete screen, review the information and click on Finish to expand the datastore: Using the ESXi CLI tool vmkfstools A VMFS volume can also be expanded using the vmkfstools tool. As with the use of any command-line tool, it can sometimes become difficult to remember the process if you are not doing it often enough to know it like the back of your hand. Hence, I have devised the following flowchart to provide an overview of the command-line steps that needs to be taken to expand a VMFS volume: Now that we know what the order of the steps would be from the flowchart, let's delve right into the procedure: Identify the datastore you want to expand using the following command, and make a note of the corresponding NAA ID: esxcli storage vmfs extent list Here, the NAA ID corresponding to the DS01 datastore is naa.6000eb30adde4c1b0000000000000083. Verify if the ESXi sees the new size of the LUN backing the datastore by issuing the following command: esxcli storage core device list -d naa.6000eb30adde4c1b0000000000000083 Get the current partition table information using the following command:Syntax: partedUtil getptbl "Devfs Path of the device" Command: partedUtil getptbl /vmfs/devices/disks/ naa.6000eb30adde4c1b0000000000000083 Calculate the new last sector value. 
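As a shortcut, the calculation in this step and the resize/grow commands in the steps that follow can be pulled together into a small shell sketch. It uses the example device and sector values from this recipe (substitute your own NAA ID and numbers), and the destructive commands are only echoed, not executed, so nothing is changed by accident.

```bash
DEVICE=/vmfs/devices/disks/naa.6000eb30adde4c1b0000000000000083

TOTAL_SECTORS=31457280   # reported by "partedUtil getptbl"
START_SECTOR=2048        # start sector of partition 1

LAST_SECTOR=$(( TOTAL_SECTORS - START_SECTOR ))
echo "New last sector: $LAST_SECTOR"                       # 31455232

# The resize and grow commands built from those values:
echo partedUtil resize "$DEVICE" 1 "$START_SECTOR" "$LAST_SECTOR"
echo vmkfstools --growfs "${DEVICE}:1" "${DEVICE}:1"
```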
Moving the last sector value closer to the total sector value is necessary in order to use additional space.The formula to calculate the last sector value is as follows: (Total number of sectors) – (Start sector value) = Last sector value So, the last sector value to be used is as follows: (31457280 – 2048) = 31455232 Resize the VMFS partition by issuing the following command:Syntax: partedUtil resize "Devfs Path" PartitionNumber NewStartingSector NewEndingSector Command: partedUtil resize /vmfs/devices/disks/ naa.6000eb30adde4c1b0000000000000083 1 2048 31455232 Issue the following command to grow the VMFS filesystem:Command syntax: vmkfstools –-growfs <Devfs Path: Partition Number> <Same Devfs Path: Partition Number> Command: vmkfstools --growfs /vmfs/devices/disks/ naa.6000eb30adde4c1b0000000000000083:1 /vmfs/devices/disks/ naa.6000eb30adde4c1b0000000000000083:1 Once the command is executed successfully, it will take you back to the root prompt. There is no on-screen output for this command. How it works... Expanding a VMFS datastore refers to the act of increasing its size within its own extent. This is possible only if there is free space available immediately after the extent. The maximum size of a LUN is 64 TB, so the maximum size of a VMFS volume is also 64 TB. The virtual machines hosted on this VMFS datastore can continue to be in the power-on state while this task is being accomplished. Summary This article walks you through the process of creating and managing VMFS datastores. Resources for Article: Further resources on this subject: Introduction Vsphere Distributed Switches? [article] Introduction Vmware Horizon Mirage [article] Backups Vmware View Infrastructure [article]


Ceph Instant Deployment

Packt
09 Feb 2015
14 min read
In this article by Karan Singh, author of the book, Learning Ceph, we will cover the following topics: Creating a sandbox environment with VirtualBox From zero to Ceph – deploying your first Ceph cluster Scaling up your Ceph cluster – monitor and OSD addition (For more resources related to this topic, see here.) Creating a sandbox environment with VirtualBox We can test deploy Ceph in a sandbox environment using Oracle VirtualBox virtual machines. This virtual setup can help us discover and perform experiments with Ceph storage clusters as if we are working in a real environment. Since Ceph is an open source software-defined storage deployed on top of commodity hardware in a production environment, we can imitate a fully functioning Ceph environment on virtual machines, instead of real-commodity hardware, for our testing purposes. Oracle VirtualBox is a free software available at http://www.virtualbox.org for Windows, Mac OS X, and Linux. We must fulfil system requirements for the VirtualBox software so that it can function properly during our testing. We assume that your host operating system is a Unix variant; for Microsoft windows, host machines use an absolute path to run the VBoxManage command, which is by default c:Program FilesOracleVirtualBoxVBoxManage.exe. The system requirement for VirtualBox depends upon the number and configuration of virtual machines running on top of it. Your VirtualBox host should require an x86-type processor (Intel or AMD), a few gigabytes of memory (to run three Ceph virtual machines), and a couple of gigabytes of hard drive space. To begin with, we must download VirtualBox from http://www.virtualbox.org/ and then follow the installation procedure once this has been downloaded. We will also need to download the CentOS 6.4 Server ISO image from http://vault.centos.org/6.4/isos/. To set up our sandbox environment, we will create a minimum of three virtual machines; you can create even more machines for your Ceph cluster based on the hardware configuration of your host machine. We will first create a single VM and install OS on it; after this, we will clone this VM twice. This will save us a lot of time and increase our productivity. Let's begin by performing the following steps to create the first virtual machine: The VirtualBox host machine used throughout in this demonstration is a Mac OS X which is a UNIX-type host. If you are performing these steps on a non-UNIX machine that is, on Windows-based host then keep in mind that virtualbox hostonly adapter name will be something like VirtualBox Host-Only Ethernet Adapter #<adapter number>. Please run these commands with the correct adapter names. On windows-based hosts, you can check VirtualBox networking options in Oracle VM VirtualBox Manager by navigating to File | VirtualBox Settings | Network | Host-only Networks. After the installation of the VirtualBox software, a network adapter is created that you can use, or you can create a new adapter with a custom IP:For UNIX-based VirtualBox hosts # VBoxManage hostonlyif remove vboxnet1 # VBoxManage hostonlyif create # VBoxManage hostonlyif ipconfig vboxnet1 --ip 192.168.57.1 --netmask 255.255.255.0 For Windows-based VirtualBox hosts # VBoxManage.exe hostonlyif remove "VirtualBox Host-Only Ethernet Adapter" # VBoxManage.exe hostonlyif create # VBoxManage hostonlyif ipconfig "VirtualBox Host-Only Ethernet Adapter" --ip 192.168.57.1 --netmask 255.255.255. VirtualBox comes with a GUI manager. 
If your host is running Linux OS, it should have the X-desktop environment (Gnome or KDE) installed. Open Oracle VM VirtualBox Manager and create a new virtual machine with the following specifications using GUI-based New Virtual Machine Wizard, or use the CLI commands mentioned at the end of every step: 1 CPU 1024 MB memory 10 GB X 4 hard disks (one drive for OS and three drives for Ceph OSD) 2 network adapters CentOS 6.4 ISO attached to VM The following is the step-by-step process to create virtual machines using CLI commands: Create your first virtual machine: # VBoxManage createvm --name ceph-node1 --ostype RedHat_64 --register # VBoxManage modifyvm ceph-node1 --memory 1024 --nic1 nat --nic2 hostonly --hostonlyadapter2 vboxnet1 For Windows VirtualBox hosts: # VBoxManage.exe modifyvm ceph-node1 --memory 1024 --nic1 nat --nic2 hostonly --hostonlyadapter2 "VirtualBox Host-Only Ethernet Adapter" Create CD-Drive and attach CentOS ISO image to first virtual machine: # VBoxManage storagectl ceph-node1 --name "IDE Controller" --add ide --controller PIIX4 --hostiocache on --bootable on # VBoxManage storageattach ceph-node1 --storagectl "IDE Controller" --type dvddrive --port 0 --device 0 --medium CentOS-6.4-x86_64-bin-DVD1.iso Make sure you execute the preceding command from the same directory where you have saved CentOS ISO image or you can specify the location where you saved it. Create SATA interface, OS hard drive and attach them to VM; make sure the VirtualBox host has enough free space for creating vm disks. If not, select the host drive which have free space: # VBoxManage storagectl ceph-node1 --name "SATA Controller" --add sata --controller IntelAHCI --hostiocache on --bootable on # VBoxManage createhd --filename OS-ceph-node1.vdi --size 10240 # VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 0 --device 0 --type hdd --medium OS-ceph-node1.vdi Create SATA interface, first ceph disk and attach them to VM: # VBoxManage createhd --filename ceph-node1-osd1.vdi --size 10240 # VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 1 --device 0 --type hdd --medium ceph-node1-osd1.vdi Create SATA interface, second ceph disk and attach them to VM: # VBoxManage createhd --filename ceph-node1-osd2.vdi --size 10240 # VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 2 --device 0 --type hdd --medium ceph-node1-osd2.vdi Create SATA interface, third ceph disk and attach them to VM: # VBoxManage createhd --filename ceph-node1-osd3.vdi --size 10240 # VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" --port 3 --device 0 --type hdd --medium ceph-node1-osd3.vdi Now, at this point, we are ready to power on our ceph-node1 VM. You can do this by selecting the ceph-node1 VM from Oracle VM VirtualBox Manager, and then clicking on the Start button, or you can run the following command: # VBoxManage startvm ceph-node1 --type gui As soon as you start your VM, it should boot from the ISO image. After this, you should install CentOS on VM. If you are not already familiar with Linux OS installation, you can follow the documentation at https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Installation_Guide/index.html. 
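As an aside, the three OSD-disk create-and-attach commands above follow an identical pattern, so they can also be generated with a small loop. This sketch is for ceph-node1, using the same 10 GB size and SATA ports 1-3 as above.

```bash
# Create and attach the three Ceph OSD disks for ceph-node1 in one pass.
for i in 1 2 3; do
  VBoxManage createhd --filename "ceph-node1-osd${i}.vdi" --size 10240
  VBoxManage storageattach ceph-node1 --storagectl "SATA Controller" \
    --port "$i" --device 0 --type hdd --medium "ceph-node1-osd${i}.vdi"
done
```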
Once you have successfully installed the operating system, edit the network configuration of the machine: Edit /etc/sysconfig/network and change the hostname parameter HOSTNAME=ceph-node1 Edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file and add: ONBOOT=yes BOOTPROTO=dhcp Edit the /etc/sysconfig/network-scripts/ifcfg-eth1 file and add:< ONBOOT=yes BOOTPROTO=static IPADDR=192.168.57.101 NETMASK=255.255.255.0 Edit the /etc/hosts file and add: 192.168.57.101 ceph-node1 192.168.57.102 ceph-node2 192.168.57.103 ceph-node3 Once the network settings have been configured, restart VM and log in via SSH from your host machine. Also, test the Internet connectivity on this machine, which is required to download Ceph packages: # ssh root@192.168.57.101 Once the network setup has been configured correctly, you should shut down your first VM so that we can make two clones of your first VM. If you do not shut down your first VM, the cloning operation might fail. Create clone of ceph-node1 as ceph-node2: # VBoxManage clonevm --name ceph-node2 ceph-node1 --register Create clone of ceph-node1 as ceph-node3: # VBoxManage clonevm --name ceph-node3 ceph-node1 --register After the cloning operation is complete, you can start all three VMs: # VBoxManage startvm ceph-node1 # VBoxManage startvm ceph-node2 # VBoxManage startvm ceph-node3 Set up VM ceph-node2 with the correct hostname and network configuration: Edit /etc/sysconfig/network and change the hostname parameter: HOSTNAME=ceph-node2 Edit the /etc/sysconfig/network-scripts/ifcfg-<first interface name> file and add: DEVICE=<correct device name of your first network interface, check ifconfig -a> ONBOOT=yes BOOTPROTO=dhcp HWADDR= <correct MAC address of your first network interface, check ifconfig -a > Edit the /etc/sysconfig/network-scripts/ifcfg-<second interface name> file and add: DEVICE=<correct device name of your second network interface, check ifconfig -a> ONBOOT=yes BOOTPROTO=static IPADDR=192.168.57.102 NETMASK=255.255.255.0 HWADDR= <correct MAC address of your second network interface, check ifconfig -a > Edit the /etc/hosts file and add: 192.168.57.101 ceph-node1 192.168.57.102 ceph-node2 192.168.57.103 ceph-node3 After performing these changes, you should restart your virtual machine to bring the new hostname into effect. The restart will also update your network configurations. Set up VM ceph-node3 with the correct hostname and network configuration: Edit /etc/sysconfig/network and change the hostname parameter:HOSTNAME=ceph-node3 Edit the /etc/sysconfig/network-scripts/ifcfg-<first interface name> file and add: DEVICE=<correct device name of your first network interface, check ifconfig -a> ONBOOT=yes BOOTPROTO=dhcp HWADDR= <correct MAC address of your first network interface, check ifconfig -a > Edit the /etc/sysconfig/network-scripts/ifcfg-<second interface name> file and add: DEVICE=<correct device name of your second network interface, check ifconfig -a> ONBOOT=yes BOOTPROTO=static IPADDR=192.168.57.103 NETMASK=255.255.255.0 HWADDR= <correct MAC address of your second network interface, check ifconfig -a > Edit the /etc/hosts file and add: 192.168.57.101 ceph-node1 192.168.57.102 ceph-node2 192.168.57.103 ceph-node3 After performing these changes, you should restart your virtual machine to bring a new hostname into effect; the restart will also update your network configurations. At this point, we prepare three virtual machines and make sure each VM communicates with each other. 
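A quick way to confirm that the three nodes can reach one another on the host-only network is to ping the addresses assigned above. Run this from ceph-node1 (the -W timeout flag is the Linux ping form; adjust it if you run the check from a different host).

```bash
# Check that every node answers on its host-only address.
for ip in 192.168.57.101 192.168.57.102 192.168.57.103; do
  if ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
    echo "$ip is reachable"
  else
    echo "$ip is NOT reachable - check the VM network settings"
  fi
done
```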
They should also have access to the Internet to install Ceph packages. From zero to Ceph – deploying your first Ceph cluster To deploy our first Ceph cluster, we will use the ceph-deploy tool to install and configure Ceph on all three virtual machines. The ceph-deploy tool is a part of the Ceph software-defined storage, which is used for easier deployment and management of your Ceph storage cluster. Since we created three virtual machines that run CentOS 6.4 and have connectivity with the Internet as well as private network connections, we will configure these machines as Ceph storage clusters as mentioned in the following diagram: Configure ceph-node1 for an SSH passwordless login to other nodes. Execute the following commands from ceph-node1: While configuring SSH, leave the paraphrase empty and proceed with the default settings: # ssh-keygen Copy the SSH key IDs to ceph-node2 and ceph-node3 by providing their root passwords. After this, you should be able to log in on these nodes without a password: # ssh-copy-id ceph-node2 Installing and configuring EPEL on all Ceph nodes: Install EPEL which is the repository for installing extra packages for your Linux system by executing the following command on all Ceph nodes: # rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm Make sure the baserul parameter is enabled under the /etc/yum.repos.d/epel.repo file. The baseurl parameter defines the URL for extra Linux packages. Also make sure the mirrorlist parameter must be disabled (commented) under this file. Problems been observed during installation if the mirrorlist parameter is enabled under epel.repo file. Perform this step on all the three nodes. Install ceph-deploy on the ceph-node1 machine by executing the following command from ceph-node1: # yum install ceph-deploy Next, we will create a Ceph cluster using ceph-deploy by executing the following command from ceph-node1: # ceph-deploy new ceph-node1 ## Create a directory for ceph # mkdir /etc/ceph # cd /etc/ceph The new subcommand of ceph-deploy deploys a new cluster with ceph as the cluster name, which is by default; it generates a cluster configuration and keying files. List the present working directory; you will find the ceph.conf and ceph.mon.keyring files. In this testing, we will intentionally install the Emperor release (v0.72) of Ceph software, which is not the latest release. Later in this book, we will demonstrate the upgradation of Emperor to Firefly release of Ceph. To install Ceph software binaries on all the machines using ceph-deploy; execute the following command from ceph-node1:ceph-deploy install --release emperor ceph-node1 ceph-node2 ceph-node3 The ceph-deploy tool will first install all the dependencies followed by the Ceph Emperor binaries. Once the command completes successfully, check the Ceph version and Ceph health on all the nodes, as follows: # ceph –v Create your first monitor on ceph-node1: # ceph-deploy mon create-initial Once monitor creation is successful, check your cluster status. Your cluster will not be healthy at this stage: # ceph status Create an object storage device (OSD) on the ceph-node1 machine, and add it to the Ceph cluster executing the following steps: List the disks on VM: # ceph-deploy disk list ceph-node1 From the output, carefully identify the disks (other than OS-partition disks) on which we should create Ceph OSD. In our case, the disk names will ideally be sdb, sdc, and sdd. 
The disk zap subcommand will destroy the existing partition table and content from the disk. Before running the following command, make sure you use the correct disk device name. # ceph-deploy disk zap ceph-node1:sdb ceph-node1:sdc ceph-node1:sdd The osd create subcommand will first prepare the disk, that is, erase the disk with a filesystem, which is xfs by default. Then, it will activate the disk's first partition as data partition and second partition as journal: # ceph-deploy osd create ceph-node1:sdb ceph-node1:sdc ceph-node1:sdd Check the cluster status for new OSD entries: # ceph status At this stage, your cluster will not be healthy. We need to add a few more nodes to the Ceph cluster so that it can set up a distributed, replicated object storage, and hence become healthy. Scaling up your Ceph cluster – monitor and OSD addition Now we have a single-node Ceph cluster. We should scale it up to make it a distributed, reliable storage cluster. To scale up a cluster, we should add more monitor nodes and OSD. As per our plan, we will now configure ceph-node2 and ceph-node3 machines as monitor as well as OSD nodes. Adding the Ceph monitor A Ceph storage cluster requires at least one monitor to run. For high availability, a Ceph storage cluster relies on an odd number of monitors that's more than one, for example, 3 or 5, to form a quorum. It uses the Paxos algorithm to maintain quorum majority. Since we already have one monitor running on ceph-node1, let's create two more monitors for our Ceph cluster: The firewall rules should not block communication between Ceph monitor nodes. If they do, you need to adjust the firewall rules in order to let monitors form a quorum. Since this is our test setup, let's disable firewall on all three nodes. We will run these commands from the ceph-node1 machine, unless otherwise specified: # service iptables stop # chkconfig iptables off # ssh ceph-node2 service iptables stop # ssh ceph-node2 chkconfig iptables off # ssh ceph-node3 service iptables stop # ssh ceph-node3 chkconfig iptables off Deploy a monitor on ceph-node2 and ceph-node3: # ceph-deploy mon create ceph-node2 # ceph-deploy mon create ceph-node3 The deploy operation should be successful; you can then check your newly added monitors in the Ceph status: You might encounter warning messages related to clock skew on new monitor nodes. To resolve this, we need to set up Network Time Protocol (NTP) on new monitor nodes: # chkconfig ntpd on # ssh ceph-node2 chkconfig ntpd on # ssh ceph-node3 chkconfig ntpd on # ntpdate pool.ntp.org # ssh ceph-node2 ntpdate pool.ntp.org # ssh ceph-node3 ntpdate pool.ntp.org # /etc/init.d/ntpd start # ssh ceph-node2 /etc/init.d/ntpd start # ssh ceph-node3 /etc/init.d/ntpd start Adding the Ceph OSD At this point, we have a running Ceph cluster with three monitors OSDs. Now we will scale our cluster and add more OSDs. To accomplish this, we will run the following commands from the ceph-node1 machine, unless otherwise specified. We will follow the same method for OSD addition: # ceph-deploy disk list ceph-node2 ceph-node3 # ceph-deploy disk zap ceph-node2:sdb ceph-node2:sdc ceph-node2:sdd # ceph-deploy disk zap ceph-node3:sdb ceph-node3:sdc ceph-node3:sdd # ceph-deploy osd create ceph-node2:sdb ceph-node2:sdc ceph-node2:sdd # ceph-deploy osd create ceph-node3:sdb ceph-node3:sdc ceph-node3:sdd # ceph status Check the cluster status for a new OSD. 
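The per-node commands above are identical apart from the node name, so the scale-out can also be written as a short loop. Run it from ceph-node1; it assumes the passwordless SSH configured earlier and the same sdb/sdc/sdd disk layout, and remember that disk zap is destructive, so double-check the device names first.

```bash
# Zap the data disks and create OSDs on each additional node, then check status.
for node in ceph-node2 ceph-node3; do
  ceph-deploy disk list "$node"
  ceph-deploy disk zap "$node":sdb "$node":sdc "$node":sdd
  ceph-deploy osd create "$node":sdb "$node":sdc "$node":sdd
done
ceph status
```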
At this stage, your cluster will be healthy with nine OSDs in and up: Summary The software-defined nature of Ceph provides a great deal of flexibility to its adopters. Unlike other proprietary storage systems, which are hardware dependent, Ceph can be easily deployed and tested on almost any computer system available today. Moreover, if getting physical machines is a challenge, you can use virtual machines to install Ceph, as mentioned in this article, but keep in mind that such a setup should only be used for testing purposes. In this article, we learned how to create a set of virtual machines using the VirtualBox software, followed by Ceph deployment as a three-node cluster using the ceph-deploy tool. We also added a couple of OSDs and monitor machines to our cluster in order to demonstrate its dynamic scalability. We recommend you deploy a Ceph cluster of your own using the instructions mentioned in this article. Resources for Article: Further resources on this subject: Linux Shell Scripting - various recipes to help you [article] GNU Octave: Data Analysis Examples [article] What is Kali Linux [article]


How to Run Hadoop on Google Cloud – Part 2

Robi Sen
15 Dec 2014
7 min read
Setting up and working with Hadoop can sometimes be difficult. Furthermore, most people with limited resources develop on Hadoop instances on Virtual Machines locally or on minimal hardware. The problem with this is that Hadoop is really designed to run on many machines in order to realize its full capabilities. In this two part series of posts (read part 1 here), we will show you how you can quickly get started with Hadoop in the cloud with Google services. In the last part in this series, we installed our Google developer account. Now it is time to install the Google Cloud SDK. Installing the Google Cloud SDK To work with the Google Cloud SDK, we need a Cygwin 32-bit version. Get it here, even if you have a 64-bit processor. The reason for this is that the Python 64-bit version for Windows has issues that make it incompatible with many common Python tools. So you should stick with the 32-bit version. Next, when you install Cygwin, you need to make sure you select Python (note that if you do not install the Cygwin version of Python, your installation will fail), openssh, and curl. You can do this when you get to the package screen by typing openssh or curl in the search bar at top and selecting the package under "net," then by selecting the check box under "Bin" for openssh. Do the same for curl. You should see something like what is shown in Figures 1 and 2 respectively. Figure 1: Adding openssh   Figure 2: Adding curl to Cygwin Now go ahead and start Cygwin by going to Start -> All Programs -> Cygwin -> Cygwin Terminal. Now use curl to install the Google Cloud SDK by typing the following command “$ curl https://sdk.cloud.google.com | bash,” which will install the Google Cloud SDK from the Internet. Follow the prompts to complete the setup. When prompted, if you would like to update your system path, select "y" and when complete, restart Cygwin. After you restart Cygwin, you need to authenticate with the Google Cloud SDK. To do this type "gcloud auth login –no-launch-browser" like in Figure 3.   Figure 3: Authenticating with Google Cloud SDK tools Cloud SDK will then give you a URL that you should copy and paste in your browser. You will then be asked to log in with your Google account and accept the permissions requested by the SDK as in Figure 4.   Figure 4: Google Cloud authorization via OAuth Google will provide you with a verification code that you can cut and paste into the command line and if everything works, you should be logged in. Next, set your project ID for this session by using the command "$ gcloud config set project YOURPROJECTID" as in Figure 5.   Figure 5: Setting your project ID Now you need to download the set of scripts that will help you set up Hadoop in Google Cloud Storage.[1] Make sure you do not close this command-line window because we are going to use it again. Download the Big Data utilities scripts to set up Hadoop in the Cloud here. Once you have downloaded the zip, unpack it and place it in the directory wherever you want. Now, in the command line, type "gsutil mb -p YOURPROJECTID gs://SOMEBUCKETNAME." If everything goes well, you should see something like Figure 6. Figure 6: Creating your Google Cloud Storage bucker YOURPROJECTID is the project ID you created or were assigned earlier and SOMEBUCKETNAME is whatever you want your bucket to be called. Unfortunately, bucket names must be unique. Read more here, so using something like your company domain name and some other unique identifier might be a good idea. 
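Putting the SDK commands so far together, the bootstrap up to bucket creation looks roughly like the following. The project ID and bucket name are placeholders; pick your own, globally unique bucket name.

```bash
PROJECT_ID=your-project-id
BUCKET=yourdomain-hadoop-demo                # must be globally unique

curl https://sdk.cloud.google.com | bash     # install the Google Cloud SDK
gcloud auth login --no-launch-browser        # paste the verification code back in
gcloud config set project "$PROJECT_ID"
gsutil mb -p "$PROJECT_ID" "gs://$BUCKET"    # bucket used by the bdutil scripts
```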
If you do not pick a unique name, you will get an error. Now go to the directory where you stored your Big Data Utility Scripts and open bdutil_env.sh in a text editor as in Figure 7.   Figure 7: Editing the bdutil_env.sh file Now add your bucket name for the CONFIGBUCKET  value in the file and your project ID for the PROJECT value like in Figure 8. Now save the file. Figure 8: Editing the bdutil_env.sh file Once you have the bdutil_env.sh file, you need to test that you can reach your compute instances via gcutil and ssh. Let’s walk through that now to set it up so you can do it in the future. In Cygwin, create a test instance to play with and set up gcutil by typing the command "gcutil addinstance mytest," then hit Enter. You will be asked to select a time zone (I selected 6), a number of processors, and the like. Go ahead and select the items you want since after we create this instance and connect to it, we will delete it. After you walk through the setup steps, Google will create your instance. During the creation, you will be asked for a passphrase. Make sure you use a passphrase you can remember. Now, in the command line, type "gcutil ssh mytest." This will now try to connect to your "mytest" instance via SSH, and if it’s the first time you have done this, you will be asked to type in a passphrase. Do not type a passphrase; just leave it blank and select Enter. This will then create a public and private ssh key. If everything works, you should now connect to the instance and you will know gcutil ssh is working correct. Go ahead and type "exit" and then "gcutil deleteinstance mytest" and select "y" for all questions. This will trigger the Google Cloud to destroy your test instance. Now in Cygwin, navigate to where you placed the dbutils download. If you are not familiar with Cygwin, you can navigate to any directory on the c drive by using the "cygdrive/c" and then set the Unix style path to your directory. So, for example, on my computer it would look like Figure 9. Figure 9: Navigating to the dbutils folder in Cygwin Now we can attempt a deployment of Haddop by typing "./bdutil deploy" like in Figure 10. Figure 10: Deploying Hadoop The system will now try to deploy your Hadoop instance to the Cloud. You might be prompted to create a staging directory as well while the script is running. Go ahead and type "y" to accept. You should now see a message saying "Deployment complete." It might take several minutes for your job to complete, so be patient. When it is finished, check to see whether your cluster is up by typing in "gcutil listinstances", where you will see something like what is shown in Figure 11. Figure 11: A list of Hadoop instances running From here, you need to test your deployment, which you do via the command "gcutil ssh –project=YOURPROJECTID hs-ghfs-nn < Hadoop-validate-setup.sh" like in Figure 12. Figure 12: Validating Hadoop deployment If the script runs successfully, you should see an output like "teragen, terasort, teravalidate passed." From there, go ahead and delete the project by typing "./bdutil delete." This will delete the deployed virtual machines (VMs) and associated artifacts. When it’s done, you should see message "Done deleting VMs!" Summary In this two part blog post series, you learned how to use the Google Cloud SDK to set up Hadoop via Windows and Cygwin. Now you have Cygwin set up and configured to build, connect to the Google Cloud, set up instances, and deploy Hadoop. If you want even more Hadoop content, visit our Hadoop page. 
Featuring our latest releases and our top free Hadoop content, it's the centre of Packt's Big Data coverage. About the author Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus year career in technology, engineering, and research has led him to work on cutting edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as Under Armour, Sony, CISCO, IBM, and many others to help build out new products and services. Robi specializes in bringing his unique vision and thought process to difficult and complex problems, allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.


How to Auto-Scale Your Cloud with SaltStack

Nicole Thomas
15 Dec 2014
10 min read
What is SaltStack? SaltStack is an extremely fast, scalable, and powerful remote execution engine and configuration management tool created to control distributed infrastructure, code, and data efficiently. At the heart of SaltStack, or “Salt”, is its remote execution engine, which is a bi-directional, secure communication system administered through the use of a Salt Master daemon. This daemon is used to control Salt Minion daemons, which receive commands from the remote Salt Master. A major component of Salt’s approach to configuration management is Salt Cloud, which was made to manage Salt Minions in cloud environments. The main purpose of Salt Cloud is to spin up instances on cloud providers, install a Salt Minion on the new instance using Salt’s Bootstrap Script, and configure the new minion so it can immediately get to work. Salt Cloud makes it easy to get an infrastructure up and running quickly and supports an array of cloud providers such as OpenStack, Digital Ocean, Joyent, Linode, Rackspace, Amazon EC2, and Google Compute Engine to name a few. Here is a full list of cloud providers supported by SaltStack and the automation features supported for each. What is cloud auto scaling? One of the most formidable benefits of cloud application hosting and data storage is the cloud infrastructure’s capacity to scale as demand fluctuates. Many cloud providers offer auto scaling features that automatically increase or decrease the number of instances that are up and running in a user’s cloud at any given time. These components generate new instances as needed to ensure optimal performance as activity escalates, while during idle periods, instances are destroyed to reduce costs. To harness the power of cloud auto-scaling technologies, SaltStack provides two reactor formulas that integrate Salt’s configuration management and remote execution capabilities for either Amazon EC2 Auto Scaling or Rackspace Auto Scale. The Salt Cloud Reactor Salt Formulas can be very helpful in the rapid build out of management frameworks for cloud infrastructures. Formulas are pre-written Salt States that can be used to configure services, install packages, or any other common configuration management tasks. The Salt Cloud Reactor is a formula that allows Salt to interact with supported Salt Cloud providers who provide cloud auto scaling features. (Note: at the time this article was written, the only supported Salt Cloud providers with cloud auto scaling capabilities were Rackspace Auto Scale and Amazon EC2 Auto Scaling. The Salt Cloud Reactor can also be used directly with EC2 Auto Scaling, but it is recommended that the EC2 Autoscale Reactor be used instead, as discussed in the following section.) The Salt Cloud Reactor allows SaltStack to know when instances are spawned or destroyed by the cloud provider. When a new instance comes online, a Salt Minion is automatically installed and the minion’s key is accepted by the Salt Master. If the configuration for the minion contains the appropriate startup state, it will configure itself and start working on its tasks. Accordingly, when an instance is deleted by the cloud provider, the minion’s key is removed from the Salt Master. In order to use the Salt Cloud Reactor, the Salt Master must be configured appropriately. In addition to applying all necessary settings on the Salt Master, a Salt Cloud query must be executed on a regular basis. 
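In practice, the query itself is just a Salt Cloud full query issued from the Salt Master, for example:

$ salt-cloud --full-query

This is the same command that the scheduling options discussed below will run on your behalf; treat it as an illustrative note rather than an addition to the formula's requirements.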
The query polls data from the cloud provider to collect changes in the auto scaling sequence, as cloud providers using the Salt Cloud Reactor do not directly trigger notifications to Salt upon instance creation and deletion. The cloud query must be issued via a scheduling system such as cron or the Salt Scheduler. Once the Salt Master has been configured and query scheduling has been implemented, the reactor will manage itself and allow the Salt Master to interact with any Salt Minions created or destroyed by the auto scaling system.

The EC2 Autoscale Reactor

Salt's EC2 Autoscale Reactor enables Salt to collaborate with Amazon EC2 Auto Scaling. Similarly to the Salt Cloud Reactor, the EC2 Autoscale Reactor will bootstrap a Salt Minion on any newly created instances, and the Salt Master will automatically accept the new minion's key. Additionally, when an EC2 instance is destroyed, the Salt Minion's key will be automatically removed from the Salt Master.

However, the EC2 Autoscale Reactor formula differs from the Salt Cloud Reactor formula in one major way. Amazon EC2 provides notifications directly to the reactor when the EC2 cloud is scaled up or down, making it easy for Salt to immediately bootstrap new instances with a Salt Minion, or to delete old Salt Minion keys from the master. This behavior, therefore, does not require any kind of scheduled query to poll EC2 for changes in scale like the Salt Cloud Reactor demands. Changes to the EC2 cloud can be acted upon by the Salt Master immediately, whereas changes in clouds using the Salt Cloud Reactor may experience a delay between the instance being created and the Salt Master bootstrapping the instance with a new minion.

Configuring the EC2 Autoscale Reactor

Both of the cloud auto scaling reactors were only recently added to the SaltStack arsenal, and as such, the Salt develop branch is required to set up any auto scaling capabilities. To get started, clone the Salt repository from GitHub onto the machine serving as the Salt Master:

git clone https://github.com/saltstack/salt

Depending on the operating system you are using, there are a few dependencies that also need to be installed to run SaltStack from the develop branch. Check out the Installing Salt for Development documentation for OS-specific instructions.

Once Salt has been installed for development, the Salt Master needs to be configured. First, create the default salt directory in /etc:

mkdir /etc/salt

The default Salt Master configuration file resides in salt/conf/master. Copy this file into the new salt directory:

cp path/to/salt/conf/master /etc/salt/master

The Salt Master configuration file is completely commented out, as the default configuration for the master will work on most systems. However, some additional settings must be configured to enable the EC2 Autoscale Reactor to work with the Salt Master. Under the external_auth section of the master configuration file, replace the commented out lines with the following:

external_auth:
  pam:
    myuser:
      - .*
      - '@runner'
      - '@wheel'

rest_cherrypy:
  port: 8080
  host: 0.0.0.0
  webhook_url: /hook
  webhook_disable_auth: True

reactor:
  - 'salt/netapi/hook/ec2/autoscale':
    - '/srv/reactor/ec2-autoscale.sls'

ec2.autoscale:
  provider: my-ec2-config
  ssh_username: ec2-user

These settings allow the Salt API web hook system to interact with EC2. When a web request is received from EC2, the Salt API will execute an event for the reactor system to respond to.
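As an optional sanity check (this is our own suggestion rather than part of the formula's documentation; the hostname matches the example Salt Master address used later in this article), you can confirm that the webhook wiring works by posting a test request to the Salt API and watching the master's event bus:

# In one terminal on the Salt Master, watch the event bus
$ salt-run state.event pretty=True

# In another terminal, fire a fake notification at the webhook endpoint
$ curl -X POST https://saltmaster.example.com/hook/ec2/autoscale -d test=true

You should see an event tagged salt/netapi/hook/ec2/autoscale appear, which is the tag the reactor section above listens for.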
The final ec2.autoscale setting points the reactor to the corresponding Salt Cloud provider configuration file. If authenticity problems with the reactor's web hook occur, an email notification from Amazon will be sent to the user. To configure the Salt Master to connect to a mail server, see the example SMTP settings in the EC2 Autoscale Reactor documentation.

Next, the Salt Cloud provider configuration file must be created. First, create the cloud provider configuration directory:

mkdir /etc/salt/cloud.providers.d

In /etc/salt/cloud.providers.d, create a file named ec2.conf, and set the following configurations according to your Amazon EC2 account:

my-ec2-config:
  id: <my aws id>
  key: <my aws key>
  keyname: <my aws key name>
  securitygroup: <my aws security group>
  private_key: </path/to/my/private_key.pem>
  location: us-east-1
  provider: ec2
  minion:
    master: saltmaster.example.com

The last line, master: saltmaster.example.com, is the address of the Salt Master, so that each new Salt Minion knows where to connect once it is up and running.

To set up the actual reactor, create a new reactor directory, download the ec2-autoscale-reactor formula, and copy the reactor formula into the new directory, like so:

mkdir /srv/reactor
cp path/to/downloaded/package/ec2-autoscale.sls /srv/reactor/ec2-autoscale.sls

The last major configuration step is to configure all of the appropriate settings on your EC2 account. First, log in to your AWS account and set up SNS HTTP(S) notifications by selecting SNS (Push Notification Service) from the AWS Console. Click Create New Topic, enter a topic name and a display name, and click the Create Topic button. Then, inside the Topic Details area, click Create Subscription. Choose HTTP or HTTPS as needed and enter the web hook for the Salt API. Assuming your Salt Master is set up at https://saltmaster.example.com, the final web hook endpoint will be https://saltmaster.example.com/hook/ec2/autoscale. Finally, click Subscribe.

Next, set up the launch configuration by choosing EC2 (Virtual Servers in the Cloud) from the AWS Console. Then, select Launch Configurations on the left-hand side. Click Create Launch Configuration and follow the prompts to define the appropriate settings for your cloud. Finally, on the review screen, click Create Launch Configuration to save your settings.

Once the launch configuration is set up, click Auto Scaling Groups from the left-hand navigation menu to create auto scaling variables such as the minimum and maximum number of instances your cloud should contain. Click Create Auto Scaling Group, choose Create an Auto Scaling group from an existing launch configuration, select the appropriate configuration, and then click Next Step. From there, follow the prompts until you reach the Configure Notifications screen. Click Add Notification and choose the notification setting that was configured during the SNS configuration step. Finally, complete the rest of the prompts. Congratulations! At this point, you should have successfully configured SaltStack to work with EC2 Auto Scaling!

Salt Scheduler

As mentioned in the Salt Cloud Reactor section, some type of scheduling system must be implemented when using the Salt Cloud Reactor formula.
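If you prefer plain cron for this, an entry along the following lines would work. This is our own illustrative sketch rather than an excerpt from the formula's documentation, and the salt-cloud path and five-minute interval are assumptions you should adjust for your system:

# /etc/cron.d/salt-cloud-query: poll the cloud provider every 5 minutes
*/5 * * * * root /usr/bin/salt-cloud --full-query > /dev/null 2>&1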
SaltStack provides its own scheduler, which can be used by adding the following state to the Salt Master's configuration file:

schedule:
  job1:
    function: cloud.full_query
    seconds: 300

Here, the seconds setting ensures that the Salt Master will perform a salt-cloud --full-query command every 5 minutes. A value of 300 seconds or greater is recommended; however, the value can be changed as necessary.

Salting instances from the web interface

Another exciting quality of Salt's auto-scale reactor formulas is that once a reactor is configured, the respective cloud provider's web interface can be used to spin up new instances that are automatically "Salted". Since the reactor automatically installs a Salt Minion on any new instance, it performs the same operations when instances are created manually via the web interface. The same is true for manually deleting instances: if an instance is manually destroyed via the web interface, the corresponding minion's key will be removed from the Salt Master.

More resources

For troubleshooting, more configuration options, or SaltStack specifics, SaltStack has many helpful resources such as the SaltStack, Salt Cloud, Salt Cloud Reactor, and EC2 Autoscale Reactor documentation. SaltStack also has a thriving, active, and friendly open source community.

About the Author

Nicole Thomas is a QA Engineer at SaltStack, Inc. Before coming to SaltStack, she wore many hats, from web and Android developer to contributing editor to working in Environmental Education. Nicole recently graduated Summa Cum Laude from Westminster College with a degree in Computer Science. Nicole also has a degree in Environmental Studies from the University of Utah.


How to Run Hadoop on Google Cloud – Part 1

Robi Sen
15 Dec 2014
4 min read
Setting up and working with Hadoop can sometimes be difficult. Furthermore, most people with limited resources develop on Hadoop instances on Virtual Machines locally or on minimal hardware. The problem with this is that Hadoop is really designed to run on many machines in order to realize its full capabilities. In this two part series of posts, we will show you how you can get started with Hadoop in the cloud with Google services quickly and relatively easily. Getting Started The first thing you need in order to follow along is a Google account. If you don’t have a Google account, you can sign up here: https://accounts.google.com/SignUp. Next, you need to create a Google Compute and Google Cloud storage enabled project via the Google Developers Console. Let’s walk through that right now. First go to the Developer Console and log in using your Google account. You will need your credit card as part of this process; however, to complete this two part post series, you will not need to spend any money. Once you have logged in, you should see something like what is shown in Figure 1. Figure 1: Example view of Google Developers Console Now select Create Project. This will pop up the create new project windows, as shown in Figure 2. In the project name field, go ahead and name your project HadoopTutorial. For the Project ID, Google will assign you a random project ID or you can try to select your own. Whatever your project ID is, just make note of it since we will be using it later. If, however, you forget your project ID, you can just come back to the Google console to look it up. You do not need to select the first checkbox shown in Figure 2, but go ahead and check the second checkbox, which is the terms of service. Now select Create. Figure 2: New Project window When you select Create, be prepared for a small delay as Google builds your project. When it is done, you should see a screen like that shown in Figure 3.   Figure 3: Project Dashboard Now click on Enable an API. You should now see the APIs screen. Make sure you check to see whether the Google Cloud Storage and Google Cloud Storage JSON API options are enabled, that is, showing a green ON button. Now scroll down and find the Google Compute Engine and select the OFF button to enable it like the one shown in Figure 4. If you don’t have a payment account set up on Google, you will be asked to do that now and put in a valid credit card. Once that is done, you can go back and enable the Google Compute Engine.   Figure 4: Setting up your Google APIs     You should now have your Google developer account up and running. In the next post, I will walk you through the installation of the Google Cloud SDK and setting up Hadoop via Windows and Cygwin. Read part 2 here. Want more Hadoop content? Check out our dynamic Hadoop page, updated with our latest titles and most popular content. About the author Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus year career in technology, engineering, and research has led him to work on cutting edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as Under Armour, Sony, CISCO, IBM, and many others to help build out new products and services. 
Robi specializes in bringing his unique vision and thought process to difficult and complex problems, allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.

Managing Heroku from the Command Line

Packt
20 Nov 2014
27 min read
In this article by Mike Coutermarsh, author of Heroku Cookbook, we will cover the following topics: Viewing application logs Searching logs Installing add-ons Managing environment variables Enabling the maintenance page Managing releases and rolling back Running one-off tasks and dynos Managing SSH keys Sharing and collaboration Monitoring load average and memory usage (For more resources related to this topic, see here.) Heroku was built to be managed from its command-line interface. The better we learn it, the faster and more effective we will be in administering our application. The goal of this article is to get comfortable with using the CLI. We'll see that each Heroku command follows a common pattern. Once we learn a few of these commands, the rest will be relatively simple to master. In this article, we won't cover every command available in the CLI, but we will focus on the ones that we'll be using the most. As we learn each command, we will also learn a little more about what is happening behind the scenes so that we get a better understanding of how Heroku works. The more we understand, the more we'll be able to take advantage of the platform. Before we start, let's note that if we ever need to get a list of the available commands, we can run the following command: $ heroku help We can also quickly display the documentation for a single command: $ heroku help command_name Viewing application logs Logging gets a little more complex for any application that is running multiple servers and several different types of processes. Having visibility into everything that is happening within our application is critical to maintaining it. Heroku handles this by combining and sending all of our logs to one place, the Logplex. The Logplex provides us with a single location to view a stream of our logs across our entire application. In this recipe, we'll learn how to view logs via the CLI. We'll learn how to quickly get visibility into what's happening within our application. How to do it… To start, let's open up a terminal, navigate to an existing Heroku application, and perform the following steps: First, to view our applications logs, we can use the logs command: $ heroku logs2014-03-31T23:35:51.195150+00:00 app[web.1]:   Rendered pages/about.html.slim within layouts/application (25.0ms) 2014-03-31T23:35:51.215591+00:00 app[web.1]:   Rendered layouts/_navigation_links.html.erb (2.6ms)2014-03-31T23:35:51.230010+00:00 app[web.1]:   Rendered layouts/_messages.html.slim (13.0ms)2014-03-31T23:35:51.215967+00:00 app[web.1]:   Rendered layouts/_navigation.html.slim (10.3ms)2014-03-31T23:35:51.231104+00:00 app[web.1]: Completed 200 OK in 109ms (Views: 65.4ms | ActiveRecord: 0.0ms)2014-03-31T23:35:51.242960+00:00 heroku[router]: at=info method=GET path= Heroku logs anything that our application sends to STDOUT or STDERR. If we're not seeing logs, it's very likely our application is not configured correctly.  We can also watch our logs in real time. This is known as tailing: $ heroku logs --tail Instead of --tail, we can also use -t. We'll need to press Ctrl + C to end the command and stop tailing the logs. If we want to see the 100 most recent lines, we can use -n: $ heroku logs -n 100 The Logplex stores a maximum of 1500 lines. To view more lines, we'll have to set up a log storage. We can filter the logs to only show a specific process type. Here, we will only see logs from our web dynos: $ heroku logs -p web If we want, we can be as granular as showing the logs from an individual dyno. 
This will show only the logs from the second web dyno:

$ heroku logs -p web.2

We can use this for any process type; we can try it for our workers if we'd like:

$ heroku logs -p worker

The Logplex contains more than just logs from our application. We can also view logs generated by Heroku or the API. Let's try changing the source to heroku to only see the logs generated by Heroku. This will only show us logs related to the router and resource usage:

$ heroku logs --source heroku

To view logs for only our application, we can set the source to app:

$ heroku logs --source app

We can also view logs from the API. These logs will show any administrative actions we've taken, such as scaling dynos or changing configuration variables. This can be useful when multiple developers are working on an application:

$ heroku logs --source api

We can even combine the different flags. Let's try tailing the logs for only our web dynos:

$ heroku logs -p web --tail

That's it! Remember that if we ever need more information on how to view logs via the CLI, we can always use the help command:

$ heroku help logs

How it works

Under the covers, the Heroku CLI simply passes our request to Heroku's API and then uses Ruby to parse and display our logs. If you're interested in exactly how it works, the code is open source on GitHub at https://github.com/heroku/heroku/blob/master/lib/heroku/command/logs.rb. Viewing logs via the CLI is most useful in situations where we need to see exactly what our application is doing right now. We'll find that we use it a lot around deploys and when debugging issues. Since the Logplex has a limit of 1500 lines, it's not meant for viewing historical data. For this, we'll need to set up log drains and enable a logging add-on.

Searching logs

Heroku does not have the built-in capability to search our logs from the command line. We can get around this limitation easily by making use of some other command-line tools. In this recipe, we will learn how to combine Heroku's logs with Grep, a command-line tool to search text. This will allow us to search our recent logs for keywords, helping us track down errors more quickly.

Getting ready

For this recipe, we'll need to have Grep installed. For OS X and Linux machines, it should already be installed. We can make sure Grep is installed using the following steps: To check if we have Grep installed, let's open up a terminal and type the following:

$ grep
usage: grep [-abcDEFGHhIiJLlmnOoPqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
      [-e pattern] [-f file] [--binary-files=value] [--color=when]
      [--context[=num]] [--directories=action] [--label] [--line-buffered]
      [--null] [pattern] [file ...]

If we do not see usage instructions, we can visit http://www.gnu.org/software/grep/ for the download and installation instructions.
The next command will show us the line that contains an error as well as the three lines above and below it:

$ heroku logs | grep error -C 3

We can even search with regular expressions. The next command will show us every line that matches a number that ends with MB. So, for example, lines with 100 MB, 25 MB, or 3 MB will all appear:

$ heroku logs | grep '\d*MB'

To learn more about regular expressions, visit http://regex.learncodethehardway.org/.

How it works…

Like most Unix-based tools, Grep was built to accomplish a single task and to do it well. Global regular expression print (Grep) is built to search a set of files for a pattern and then print all of the matches. Grep can also search anything it receives through standard input; this is exactly how we used it in this recipe. By piping the output of our Heroku logs into Grep, we are passing our logs to Grep as standard input.

See also

To learn more about Grep, visit http://www.tutorialspoint.com/unix_commands/grep.htm

Installing add-ons

Our application needs some additional functionality provided by an outside service. What should we do? In the past, this would have involved creating accounts, managing credentials, and maybe even bringing up servers and installing software. This whole process has been simplified by the Heroku add-on marketplace. For any additional functionality that our application needs, our first stop should always be Heroku add-ons. Heroku has made attaching additional resources to our application a plug-and-play process. If we need an additional database, caching, or error logging, they can be set up with a single command. In this recipe, we will learn the ins and outs of using the Heroku CLI to install and manage our application's add-ons.

How to do it...

To begin, let's open a terminal and navigate to one of our Heroku applications using the following steps: Let's start by taking a look at all of the available Heroku add-ons. We can do this with the addons:list command:

$ heroku addons:list

There are so many add-ons that viewing them through the CLI is pretty difficult. For easier navigation and search, we should take a look at https://addons.heroku.com/. If we want to see the currently installed add-ons for our application, we can simply type the following:

$ heroku addons
=== load-tester-rails Configured Add-ons
heroku-postgresql:dev       HEROKU_POSTGRESQL_MAROON
heroku-postgresql:hobby-dev HEROKU_POSTGRESQL_ONYX
librato:development
newrelic:stark

Remember that for any command, we can always add --app app_name to specify the application. Alternatively, our application's add-ons are also listed through the Heroku Dashboard available at https://dashboard.heroku.com.

The installation of a new add-on is done with addons:add. Here, we are going to install the error logging service, Rollbar:

$ heroku addons:add rollbar
Adding rollbar on load-tester-rails... done, v22 (free)
Use `heroku addons:docs rollbar` to view documentation.

We can quickly open up the documentation for an add-on with addons:docs:

$ heroku addons:docs rollbar

Removing an add-on is just as simple. We'll need to type our application name to confirm. For this example, our application is called load-tester-rails:

$ heroku addons:remove rollbar
!   WARNING: Destructive Action
!   This command will affect the app: load-tester-rails
!   To proceed, type "load-tester-rails" or re-run this command with --confirm load-tester-rails

> load-tester-rails
Removing rollbar on load-tester-rails...
done, v23 (free) Each add-on comes with different tiers of service. Let's try upgrading our rollbar add-on to the starter tier: $ heroku addons:upgrade rollbar:starterUpgrading to rollbar:starter on load-tester-rails... done, v26 ($12/mo)Plan changed to starterUse `heroku addons:docs rollbar` to view documentation. Now, if we want, we can downgrade back to its original level with addons:downgrade: $ heroku addons:downgrade rollbarDowngrading to rollbar on load-tester-rails... done, v27 (free)Plan changed to freeUse `heroku addons:docs rollbar` to view documentation. If we ever forget any of the commands, we can always use help to quickly see the documentation: $ heroku help addons Some add-ons might charge you money. Before continuing, let's double check that we only have the correct ones enabled, using the $ heroku addons command. How it works… Heroku has created a standardized process for all add-on providers to follow. This ensures a consistent experience when provisioning any add-on for our application. It starts when we request the creation of an add-on. Heroku sends an HTTP request to the provider, asking them to provision an instance of their service. The provider must then respond to Heroku with the connection details for their service in the form of environment variables. For example, if we were to provision Redis To Go, we will get back our connection details in a REDISTOGO_URL variable: REDISTOGO_URL: redis://user:pass@server.redistogo.com:9652 Heroku adds these variables to our application and restarts it. On restart, the variables are available for our application, and we can connect to the service using them. The specifics on how to connect using the variables will be in the add-ons documentation. Installation will depend on the specific language or framework we're using. See also For details on creating our own add-ons, the process is well documented on Heroku's website at https://addons.heroku.com/provider Check out Kensa, the CLI to create Heroku add-ons, at https://github.com/heroku/kensa Managing environment variables Our applications will often need access to various credentials in the form of API tokens, usernames, and passwords for integrations with third-party services. We can store this information in our Git repository, but then, anyone with access to our code will also have a copy of our production credentials. We should instead use environment variables to store any configuration information for our application. Configuration information should be separate from our application's code and instead be tied to the specific deployment of the application. Changing our application to use environment variables is simple. Let's look at an example in Ruby; let's assume that we currently have secret_api_token defined in our application's code: secret_api_token = '123abc' We can remove the token and replace it with an environment variable: secret_api_token = ENV['SECRET_TOKEN'] In addition to protecting our credentials, using environment variables makes our application more configurable. We'll be able to quickly make configuration changes without having to change code and redeploy. The terms "configuration variable" and "environment variable" are interchangeable. Heroku usually uses "configuration" due to how tightly the variables are coupled with the state of the application. How to do it... Heroku makes it easy to set our application's environment variables through the config command. 
Let's launch a terminal and navigate to an existing Heroku project to try it out, using the following steps: We can use the config command to see a list of all our existing environment variables: $ heroku config To view only the value of a specific variable, we can use get: $ heroku config:get DATABASE_URL To set a new variable, we can use set: $ heroku config:set VAR_NAME=var_valueSetting config vars and restarting load-tester-rails... done, v28VAR_NAME: var_value Each time we set a config variable, Heroku will restart our application. We can set multiple values at once to avoid multiple restarts: $ heroku config:set SECRET=value SECRET2=valueSetting config vars and restarting load-tester-rails... done, v29SECRET: valueSECRET2: value To delete a variable, we use unset: $ heroku config:unset SECRETUnsetting SECRET and restarting load-tester-rails... done, v30 If we want, we can delete multiple variables with a single command: $ heroku config:unset VAR_NAME SECRET2Unsetting VAR_NAME and restarting load-tester-rails... done, v31Unsetting SECRET2 and restarting load-tester-rails... done, v32 Heroku tracks each configuration change as a release. This makes it easy for us to roll back changes if we make a mistake. How it works… Environment variables are used on Unix-based operating systems to manage and share configuration information between applications. As they are so common, changing our application to use them does not lock us into deploying only to Heroku. Heroku stores all of our configuration variables in one central location. Each change to these variables is tracked, and we can view the history by looking through our past releases. When Heroku spins up a new dyno, part of the process is taking all of our configuration settings and setting them as environment variables on the dyno. This is why whenever we make a configuration change, Heroku restarts our dynos. As configuration variables are such a key part of our Heroku application, any change to them will also be included in our Heroku logs. See also Read about the Twelve-Factor app's rule on configuration at http://12factor.net/config Enabling the maintenance page Occasionally, we will need to make changes to our application that requires downtime. The proper way to do this is to put up a maintenance page that displays a friendly message and respond to all the incoming HTTP requests with a 503 Service Unavailable status. Doing this will keep our users informed and also avoid any negative SEO effects. Search engines understand that when they receive a 503 response, they should come back later to recrawl the site. If we didn't use a maintenance page and our application returned a 404 or 500 errors instead, it's possible that a search engine crawler might remove the page from their index. How to do it... Let's open up a terminal and navigate to one of our Heroku projects to begin with, using the following steps: We can view if our application's maintenance page is currently enabled with the maintenance command: $ heroku maintenanceoff Let's try turning it on. This will stop traffic from being routed to our dynos and show the maintenance page as follows: $ heroku maintenance:onEnabling maintenance mode for load-tester-rails... done Now, if we visit our application, we'll see the default Heroku maintenance page: To disable the maintenance page and resume sending users to our application, we can use the maintenance:off command: $ heroku maintenance:offDisabling maintenance mode for load-tester-rails... 
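One handy pattern that builds on these commands (this is our own example rather than one from the recipe, and the application names are hypothetical) is copying a variable from one application to another, for instance when promoting settings from staging to production:

$ heroku config:set SECRET_TOKEN=$(heroku config:get SECRET_TOKEN --app myapp-staging) --app myapp-production

Because config:get prints only the value, the shell substitution passes it straight into config:set on the target application.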
done Managing releases and rolling back What do we do if disaster strikes and our newly released code breaks our application? Luckily for us, Heroku keeps a copy of every deploy and configuration change to our application. This enables us to roll back to a previous version while we work to correct the errors in our latest release. Heads up! Rolling back only affects application code and configuration variables. Add-ons and our database will not be affected by a rollback. In this recipe, we will learn how to manage our releases and roll back code from the CLI. How to do it... In this recipe, we'll view and manage our releases from the Heroku CLI, using the releases command. Let's open up a terminal now and navigate to one of our Heroku projects by performing the following steps: Heroku tracks every deploy and configuration change as a release. We can view all of our releases from both the CLI and the web dashboard with the releases command: $ heroku releases=== load-tester-rails Releasesv33 Add WEB_CON config vars coutermarsh.mike@gmail.com 2014/03/30 11:18:49 (~ 5h ago)v32 Remove SEC config vars       coutermarsh.mike@gmail.com 2014/03/29 19:38:06 (~ 21h ago)v31 Remove VAR config vars     coutermarsh.mike@gmail.com 2014/03/29 19:38:05 (~ 21h ago)v30 Remove config vars       coutermarsh.mike@gmail.com 2014/03/29 19:27:05 (~ 21h ago)v29 Deploy 9218c1c vars coutermarsh.mike@gmail.com 2014/03/29 19:24:29 (~ 21h ago) Alternatively, we can view our releases through the Heroku dashboard. Visit https://dashboard.heroku.com, select one of our applications, and click on Activity: We can view detailed information about each release using the info command. This shows us everything about the change and state of the application during this release: $ heroku releases:info v33=== Release v33Addons: librato:development       newrelic:stark       rollbar:free       sendgrid:starterBy:     coutermarsh.mike@gmail.comChange: Add WEB_CONCURRENCY config varsWhen:   2014/03/30 11:18:49 (~ 6h ago)=== v33 Config VarsWEB_CONCURRENCY: 3 We can revert to the previous version of our application with the rollback command: $ heroku rollbackRolling back load-tester-rails... done, v32!   Warning: rollback affects code and config vars; it doesn't add or remove addons. To undo, run: heroku rollback v33 Rolling back creates a new version of our application in the release history. We can also specify a specific version to roll back to: $ heroku rollback v30Rolling back load-tester-rails... done, v30 The version we roll back to does not have to be an older version. Although it sounds contradictory, we can also roll back to newer versions of our application. How it works… Behind the scenes, each Heroku release is tied to a specific slug and set of configuration variables. As Heroku keeps a copy of each slug that we deploy, we're able to quickly roll back to previous versions of our code without having to rebuild our application. For each deploy release created, it will include a reference to the Git SHA that was pushed to master. The Git SHA is a reference to the last commit made to our repository before it was deployed. This is useful if we want to know exactly what code was pushed out in that release. On our local machine, we can run the $ git checkout git-sha-here command to view our application's code in the exact state it was when deployed. Running one-off tasks and dynos In more traditional hosting environments, developers will often log in to servers to perform basic administrative tasks or debug an issue. 
With Heroku, we can do this by launching one-off dynos. These are dynos that contain our application code but do not serve web requests. For a Ruby on Rails application, one-off dynos are often used to run database migrations or launch a Rails console. How to do it... In this recipe, we will learn how to execute commands on our Heroku applications with the heroku run command. Let's launch a terminal now to get started with the following steps: To have Heroku start a one-off dyno and execute any single command, we will use heroku run. Here, we can try it out by running a simple command to print some text to the screen: $ heroku run echo "hello heroku"Running `echo "hello heroku"` attached to terminal... up, run.7702"hello heroku" One-off dynos are automatically shut down after the command has finished running. We can see that Heroku is running this command on a dyno with our application's code. Let's run ls to see a listing of the files on the dyno. They should look familiar: $ heroku run lsRunning `ls` attached to terminal... up, run.5518app bin config config.ru db Gemfile Gemfile.lock lib log Procfile     public Rakefile README README.md tmp If we want to run multiple commands, we can start up a bash session. Type exit to close the session: $ heroku run bashRunning `bash` attached to terminal... up, run.2331~ $ lsapp bin config config.ru db Gemfile Gemfile.lock      lib log Procfile public Rakefile README README.md tmp~ $ echo "hello"hello~ $ exitexit We can run tasks in the background using the detached mode. The output of the command goes to our logs rather than the screen: $ heroku run:detached echo "hello heroku"Running `echo hello heroku` detached... up, run.4534Use `heroku logs -p run.4534` to view the output. If we need more power, we can adjust the size of the one-off dynos. This command will launch a bash session in a 2X dyno: $ heroku run --size=2X bash If we are running one-off dynos in the detached mode, we can view their status and stop them in the same way we would stop any other dyno: $ heroku ps=== run: one-off processesrun.5927 (1X): starting 2014/03/29 16:18:59 (~ 6s ago)$ heroku ps:stop run.5927 How it works… When we issue the heroku run command, Heroku spins up a new dyno with our latest slug and runs the command. Heroku does not start our application; the only command that runs is the command that we explicitly pass to it. One-off dynos act a little differently than standard dynos. If we create one dyno in the detached mode, it will run until we stop it manually, or it will shut down automatically after 24 hours. It will not restart like a normal dyno will. If we run bash from a one-off dyno, it will run until we close the connection or reach an hour of inactivity. Managing SSH keys Heroku manages access to our application's Git repository with SSH keys. When we first set up the Heroku Toolbelt, we had to upload either a new or existing public key to Heroku's servers. This key allows us to access our Heroku Git repositories without entering our password each time. If we ever want to deploy our Heroku applications from another computer, we'll either need to have the same key on that computer or provide Heroku with an additional one. It's easy enough to do this via the CLI, which we'll learn in this recipe. How to do it… To get started, let's fire up a terminal. 
We'll be using the keys command in this recipe by performing the following steps: First, let's view all of the existing keys in our Heroku account: $ heroku keys=== coutermarsh.mike@gmail.com Keysssh-rsa AAAAB3NzaC...46hEzt1Q== coutermarsh.mike@gmail.comssh-rsa AAAAB3NzaC...6EU7Qr3S/v coutermarsh.mike@gmail.comssh-rsa AAAAB3NzaC...bqCJkM4w== coutermarsh.mike@gmail.com To remove an existing key, we can use keys:remove. To the command, we need to pass a string that matches one of the keys: $ heroku keys:remove "7Qr3S/v coutermarsh.mike@gmail.com"Removing 7Qr3S/v coutermarsh.mike@gmail.com SSH key... done To add our current user's public key, we can use keys:add. This will look on our machine for a public key (~/.ssh/id_rsa.pub) and upload it: $ heroku keys:addFound existing public key: /Users/mike/.ssh/id_rsa.pubUploading SSH public key /Users/mike/.ssh/id_rsa.pub… done To create a new SSH key, we can run $ ssh-keygen -t rsa. If we'd like, we can also specify where the key is located if it is not in the default /.ssh/ directory: $ heroku keys:add /path/to/key.pub How it works… SSH keys are the standard method for password-less authentication. There are two parts to each SSH key. There is a private key, which stays on our machine and should never be shared, and there is a public key, which we can freely upload and share. Each key has its purpose. The public key is used to encrypt messages. The private key is used to decrypt messages. When we try to connect to our Git repositories, Heroku's server uses our public key to create an encrypted message that can only be decrypted by our private key. The server then sends the message to our machine; our machine's SSH client decrypts it and sends the response to the server. Sending the correct response successfully authenticates us. SSH keys are not used for authentication to the Heroku CLI. The CLI uses an authentication token that is stored in our ~/.netrc file. Sharing and collaboration We can invite collaborators through both the web dashboard and the CLI. In this recipe, we'll learn how to quickly invite collaborators through the CLI. How to do it… To start, let's open a terminal and navigate to the Heroku application that we would like to share, using the following steps: To see the current users who have access to our application, we can use the sharing command: $ heroku sharing=== load-tester-rails Access Listcoutermarsh.mike@gmail.com ownermike@form26.com             collaborator To invite a collaborator, we can use sharing:add: $ heroku sharing:add coutermarshmike@gmail.com Adding coutermarshmike@gmail.com to load-tester-rails as collaborator... done Heroku will send an e-mail to the user we're inviting, even if they do not already have a Heroku account. If we'd like to revoke access to our application, we can do so with sharing:remove:$ heroku sharing:remove coutermarshmike@gmail.comRemoving coutermarshmike@gmail.com from load-tester-rails collaborators... done How it works… When we add another collaborator to our Heroku application, they are granted the same abilities as us, except that they cannot manage paid add-ons or delete the application. Otherwise, they have full control to administrate the application. If they have an existing Heroku account, their SSH key will be immediately added to the application's Git repository. See also Interested in using multiple Heroku accounts on a single machine? Take a look at the Heroku-accounts plugin at https://github.com/ddollar/heroku-accounts. 
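Before moving on, one convenient habit related to the SSH keys recipe above (our own suggestion; the file name is arbitrary) is to keep a dedicated key pair just for Heroku, so it can be rotated or removed without touching the keys you use elsewhere:

$ ssh-keygen -t rsa -f ~/.ssh/heroku_rsa
$ heroku keys:add ~/.ssh/heroku_rsa.pub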
Monitoring load average and memory usage

We can monitor the resource usage of our dynos from the command line using the log-runtime-metrics plugin. This will give us visibility into the CPU and memory usage of our dynos. With this data, we'll be able to determine if our dynos are correctly sized, detect problems earlier, and determine whether we need to scale our application.

How to do it…

Let's open up a terminal; we'll be completing this recipe with the CLI by performing the following steps: First, we'll need to install the log-runtime-metrics plugin via the CLI. We can do this easily through heroku labs:

$ heroku labs:enable log-runtime-metrics

Now that the runtime metrics plugin is installed, we'll need to restart our dynos for it to take effect:

$ heroku restart

Now that the plugin is installed and running, our dynos' resource usage will be printed to our logs. Let's view them now:

$ heroku logs
heroku[web.1]: source=web.1 dyno=heroku.21 sample#load_avg_1m=0.00 sample#load_avg_5m=0.00
heroku[web.1]: source=web.1 dyno=heroku.21 sample#memory_total=105.28MB sample#memory_rss=105.28MB sample#memory_cache=0.00MB sample#memory_swap=0.00MB sample#memory_pgpgin=31927pages sample#memory_pgpgout=4975pages

From the logs, we can see that for this application, our load average is 0, and this dyno is using a total of 105 MB of RAM.

How it works…

Now that we have some insight into how our dynos are using resources, we need to learn how to interpret these numbers. Understanding the utilization of our dynos will be key for us if we ever need to diagnose a performance-related issue. In our logs, we will now see load_avg_1m and load_avg_5m. These are our dynos' load averages over a 1-minute and a 5-minute period. The two timeframes are helpful in determining whether we're experiencing a brief spike in activity or something more sustained.

Load average is the amount of total computational work that the CPU has to complete. The 1X and 2X dynos have access to four virtual cores. A load average of four means that the dyno's CPU is fully utilized. Any value above four is a warning sign that the dyno might be overloaded, and response times could begin to suffer. Web applications are typically not CPU-intensive, so seeing low load averages for web dynos should be expected. If we start seeing high load averages, we should consider either adding more dynos or using larger dynos to handle the load.

Our memory usage is also shown in the logs. The key value that we want to keep track of is memory_rss, which is the total amount of RAM being utilized by our application. It's best to keep this value no higher than 50 to 70 percent of the total RAM available on the dyno. For a 1X dyno with 512 MB of memory, this would mean keeping our memory usage no greater than 250 to 350 MB. This gives our application room to grow under load and helps us avoid any memory swapping. Seeing values above 70 percent is an indication that we need to either adjust our application's memory usage or scale up.

Memory swap occurs when our dyno runs out of RAM. To compensate, our dyno will begin using its hard drive to store data that would normally be stored in RAM. For any web application, any swap should be considered evil. This value should always be zero. If our dyno starts swapping, we can expect it to significantly slow down our application's response times. Seeing any swap is an immediate indication that we must either reduce our application's memory consumption or start scaling.
See also

Load average and memory usage are particularly useful when performing application load tests.

Summary

In this article, we learned various commands for viewing and searching application logs, installing add-ons, managing environment variables, enabling the maintenance page, managing SSH keys, sharing and collaboration, and so on.

Resources for Article:

Further resources on this subject: Securing vCloud Using the vCloud Networking and Security App Firewall [article] vCloud Networks [article] Apache CloudStack Architecture [article]


Amazon Web Services

Packt
20 Nov 2014
16 min read
In this article, by Prabhakaran Kuppusamy and Uchit Vyas, authors of AWS Development Essentials, you will learn about the different tools and methods available to perform the same operations at varying levels of complexity. Various options are available, depending on the user's level of experience. We will start with an overview of each service, learn about the various tools available for programmer interaction, and finally see the troubleshooting and best practices to be followed while using these services. AWS provides a handful of services in every area. In this article, we will cover the following topics:

Navigate through the AWS Management Console
Describe the security measures that AWS provides
AWS interaction through the SDK and IDE tools

(For more resources related to this topic, see here.)

Background of AWS and its needs

AWS is based on an idea presented by Chris Pinkham and Benjamin Black with a vision towards Amazon's retail computing infrastructure. The first Amazon offering was SQS, in the year 2004. Officially, AWS was launched and made available online in 2006, and within a year, 200,000 developers signed up for these services. Later, due to a natural disaster (the June 29, 2012 storm in North Virginia, which brought down most of the servers residing at that location) and other technical events, AWS faced a lot of challenges; a similar event happened in December 2012. AWS learned from these events and made sure that the same kind of outage would not occur again, even if the same events were repeated.

AWS began as an idea born in a single room, but it is now used by cloud developers and IT giants alike, and irrespective of the user's expertise, AWS has something to offer. For an expert programmer, AWS has SDKs for each service. Using these SDKs, the programmer can perform operations by entering commands in the command-line interface. However, an end user with limited knowledge of programming can still perform similar operations using the graphical user interface of the AWS Management Console, which is accessible through a web browser. If a programmer needs something between the low-level SDKs and the high-level Management Console, they can go for the integrated development environment (IDE) tools, for which AWS provides plugins and add-ons. One such commonly used IDE for which AWS has provided add-ons is the Eclipse IDE. For now, we will start with the AWS Management Console.

The AWS Management Console

The most popular method of accessing AWS is via the Management Console because of its simplicity of use and its power. Another reason why end users prefer the Management Console is that it doesn't require any software to start with; having an Internet connection and a browser is sufficient. As the name suggests, the Management Console is a place where administrative and advanced operations can be performed on your AWS account details or AWS services. The Management Console mainly focuses on the following features:

One-click access to AWS's services
AWS account administration
AWS management using handheld devices
AWS infrastructure management across the globe

One-click access to the AWS services

To access the Management Console, all you need to do is first sign up with AWS. Once done, the Management Console will be available at https://console.aws.amazon.com/.
Once you have signed up, you will be directed to the following page: Each and every icon on this page is an Amazon Web Service. Two or more services will be grouped under a category. For example, in the Analytics category, you can see three services, namely, Data Pipeline, Elastic MapReduce, and Kinesis. Starting with any of these services is very easy. Have a look at the description of the service at the bottom of the service icon. As soon as you click on the service icon, it will take you to the Getting started page of the corresponding service, where brief as well as detailed guidelines are available. In order to start with any of the services, only two things are required. The first one is an AWS account and the second one is the supported browser. The Getting started section usually will have a video, which explains the specialty and use cases of the service that you selected. Once you finish reading the Getting started section, optionally you can go through the DOC files specific to the service to know more about the syntaxes and usage of the service operations. AWS account administration The account administration is one of the most important things to make note of. To do this, click on your displayed name (in this case, Prabhakar) at the top of the page, and then click on the My Account option, as shown in the preceding screenshot. At the beginning of every month, you don't want AWS to deduct all your salary by stating that you have used these many services costing this much money; hence, all this management information is available in the Management Console. Using the Management Console, you can infer the following information: The monthly billing in brief as well as the detailed manner (cost split-up of each service) along with a provision to view VAT and tax exemption Account details, such as the display name and contact information Provision to close the AWS account All the preceding operations and much more are possible. AWS management using handheld devices Managing and accessing the AWS services is through (but not limited to) PC. AWS provides a handful of applications almost for all or most of the mobile platforms, such as Android, iOS, and so on. Using these applications, you can perform all the AWS operations on the move. You won't believe that having a 7-inch Android tablet with the installed AWS Console application from Google Play will enable you to ask for any Elastic Compute Cloud (EC2) instance from Amazon and control it (start, stop, and terminate) very easily. You can install an SSH client in the tablet and connect to the Linux terminal. However, if you wish to make use of the Windows instance from EC2, you might use the Graphics User Interface (GUI) more frequently than a command line. A few more sophisticated software and hardware might be needed, for example, you should have a VNC viewer or remote desktop connection software to get the GUI of the EC2 instance borrowed. As you are making use of the GUI in addition to the keyboard, you will need a pointer device, such as a mouse. As a result, you will almost get addicted to the concept of cloud computing going mobile. AWS infrastructure management across the globe At this point, you might be aware that you can get all of these AWS services from servers residing at any of the following locations. To control these services used by you in different regions, you don't have to go anywhere else. You can control it right here in the same Management Console. 
Using the same Management Console, just by clicking on N.Virginia and choosing the location (at the top of the Management Console), you can make the service available in that region, as shown in the following screenshot: You can choose the server location at which you want the service (data and machine) to be made available based on the following two factors: The first factor is the distance between the server's location and the client's location. For example, if you have deployed a web application for a client from North California at a Tokyo location, obviously the latency will be high while accessing the application. Therefore, choosing the optimum service location is the primary factor. The second factor is the charge for the service in a specific location. AWS charges more for certain crowded servers. Just for illustration, assume that the server for North California is used by many critical companies. So this might cost you twice if you create your servers at North California compared to the other locations. Hence, you should always consider the tradeoff between the location and cost and then decide on the server location. Whenever you click on any of the services, AWS will always select the location that costs you less money as the default. AWS security measures Whenever you think of moving your data center to a public cloud, the first question that arises in your mind is about data security. In a public cloud, through virtualization technology, multiple users might be using the same hardware (server) in which your data is available. You will learn in detail about how AWS ensures data security. Instance isolation Before learning about instance isolation, you must know how AWS EC2 provisions the instances to the user. This service allows you to rent virtual machines (AWS calls it instances) with whatever configurations you ask. Let's assume that you requested AWS to provision a 2 GB RAM, a 100 GB HDD, and an Ubuntu instance. Within a minute, you will be given the instance's connection details (public DNS, private IP, and so on), and the instance starts running. Does this mean that AWS assembled a 2*1 GB RAM and 100 GB HDD into a CPU cabinet and then installed Ubuntu OS in it and gave you the access? The answer is no. The provisioned instance is not a single PC (or bare metal) with an OS installed in it. The instance is the outcome of a virtual machine provisioned by Amazon's private cloud. The following diagram shows how a virtual machine can be provisioned by a private cloud: Let's examine the diagram from bottom to top. First, we will start with the underlying Hardware/Host. Hardware is the server, which usually has a very high specification. Here, assume that your hardware has the configuration of a 99 GB RAM, a 450 TB HDD, and a few other elements, such as NIC, which you need not consider now. The next component in your sights is the Hypervisor. A hypervisor or virtual machine monitor (VMM) is used to create and run virtual machines on the hardware. In private cloud terms, whichever machine runs a hypervisor on it is called the host machine. Three users can request each of them need instances with a 33 GB RAM and 150 TB HDD space. This request goes to the hypervisor and it then starts creating those VMs. After creating the VMs, a notification about the connection parameters will be sent to each user. In the preceding diagram, you can see the three virtual machines (VMs) created by the hypervisor. All the three VMs are running on different operating systems. 
Even if all three virtual machines are used by different users, each user will feel that only he/she has access to the hardware; user 1 might not know that the same hardware is also being used by user 2, and so on. The process of creating a virtual version of a machine, storage, or network is called virtualization, and the interesting part is that none of the virtual machines knows that it is being virtualized (that is, that all the VMs are created on the same host). After getting this information about your instances, some users may feel deceived and some will even be disappointed, asking out loud: has my instance been created on a shared disk or resource? Even though the disk (or hardware) is shared, one instance (or the owner of the instance) is isolated from the other instances on the same disk through a firewall. This concept is termed instance isolation. The following diagram demonstrates instance isolation in AWS:

The preceding diagram demonstrates how EC2 provides instances to every user. Even though all the instances reside on the same disk, they are isolated by the hypervisor, which has a firewall that performs this isolation. The physical interface does not interact with the underlying hardware (the machine or disk where the instances reside) or the virtual interface directly; all these interactions go through the hypervisor's firewall. This way, AWS ensures that no user can directly access the disk and that no instance can directly interact with another instance, even if both instances are running on the same hardware. In addition to the firewall, during the creation of an EC2 instance, the user can specify security groups that permit or deny traffic to the instance. Together, these two mechanisms provide instance isolation. In the preceding diagram, Customer 1, Customer 2, and so on are given virtualized disks, since the customer instances have no access to raw or actual disk devices. As an added security measure, the user can encrypt his/her disk so that other users cannot access the disk content (even if someone gets hold of the disk).

Isolated GovCloud

Similar to North California or Asia Pacific, GovCloud is another location where you can get your AWS services. This location is specifically designed for government bodies and agencies whose data is very confidential and valuable, and whose disclosure might result in disaster. By default, this location is not available to the user. If you want access to this location, you need to raise a compliance request at http://aws.amazon.com/compliance/contact/ and submit the FedRAMP Package Request Form, downloadable at http://cloud.cio.gov/document/fedramp-package-request-form. From these two URLs, you can understand how secure this cloud location really is.

CloudTrail

CloudTrail is an AWS service that tracks user activity and account changes. Enabling CloudTrail logs all API request information into an S3 bucket that you have created solely for this purpose. CloudTrail can also notify an SNS topic as soon as a new logfile is created, so CloudTrail together with SNS can deliver user activity to you as near real-time messages.
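Once CloudTrail is enabled, the recorded activity can also be read back programmatically rather than only from the console. The following is a minimal sketch, assuming the CloudTrail module of the AWS SDK for Java (1.11.x) is on the classpath and credentials are configured; it simply prints the most recently recorded events:

import com.amazonaws.services.cloudtrail.AWSCloudTrail;
import com.amazonaws.services.cloudtrail.AWSCloudTrailClientBuilder;
import com.amazonaws.services.cloudtrail.model.Event;
import com.amazonaws.services.cloudtrail.model.LookupEventsRequest;
import com.amazonaws.services.cloudtrail.model.LookupEventsResult;

public class RecentActivity {
    public static void main(String[] args) {
        AWSCloudTrail cloudTrail = AWSCloudTrailClientBuilder.defaultClient();

        // Ask CloudTrail for a page of the latest recorded events (50 is an arbitrary page size).
        LookupEventsResult result = cloudTrail.lookupEvents(
                new LookupEventsRequest().withMaxResults(50));

        for (Event event : result.getEvents()) {
            System.out.println(event.getEventTime() + " " + event.getEventName()
                    + " by " + event.getUsername());
        }
    }
}

This lookup only covers a recent window of activity; for long-term auditing, you would process the logfiles that CloudTrail delivers to the S3 bucket instead.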
Password

This might sound funny, but after looking at CloudTrail, if you feel that someone else is accessing your account, the best option is to change the password. Never let anyone look at your password, as this could easily compromise the entire account. Sharing the password is like leaving your treasury door open.

Multi-Factor Authentication

Until now, to access AWS through a browser, you had to log in at http://aws.amazon.com and enter your username and password. Enabling Multi-Factor Authentication (MFA) adds another layer of security by also asking you to provide an authentication code generated by the device configured with this account. On the security credentials page at https://console.aws.amazon.com/iam/home?#security_credential, there is a provision to enable MFA. Clicking on Enable will display the following window:

Selecting the first option, A virtual MFA device, will not cost you money, but it requires a smartphone (for example, one running Android), and you need to download an authenticator app from its app store. After this, during every login, you need to look at your smartphone and enter the authentication token. More information is available at https://youtu.be/MWJtuthUs0w.

Access Keys (Access Key ID and Secret Access Key)

On the same security credentials page, next to MFA, the access keys are made available. AWS will not allow you to have more than two access keys at a time; however, you can delete and recreate access keys as often as you need, as shown in the following screenshot:

The access key ID and secret access key are used when accessing a service via the API or SDK. You must provide them at that time; otherwise, you won't be able to perform any operation. To put it another way, if someone else gets or knows these keys, they could pretend to be you through the SDK and API. In the preceding screenshot, the first key is inactive and the second key is active. The Create New Access Key button is disabled because the maximum number of allowed access keys already exists, and the actual IDs in the screenshot have been obscured. It is very good practice to delete a key and create a new one every month using the Delete command link, and to toggle the active keys every week (by making them active and inactive) by clicking on the Make Active or Make Inactive command links. Never let anyone see these IDs; if you are ever in doubt, delete the key and create a new one. Clicking on the Create New Access Key button (assuming that you have fewer than two keys) will display the following window, asking you to download the new access key as a CSV file:
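To make the role of these keys concrete, here is a minimal sketch of how an access key pair is supplied to the AWS SDK for Java (1.11.x). The key values are placeholders, and in real code you would normally let the SDK pick up credentials from the environment or the shared credentials file rather than hard-coding them:

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.Bucket;

public class AccessKeyExample {
    public static void main(String[] args) {
        // Placeholder values only - never hard-code real keys in source code.
        BasicAWSCredentials keys = new BasicAWSCredentials("AKIAEXAMPLEACCESSKEY",
                "exampleSecretAccessKeyValue");

        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(keys))
                .withRegion(Regions.US_WEST_1)
                .build();

        // Every call made through this client is signed with the access key pair above.
        for (Bucket bucket : s3.listBuckets()) {
            System.out.println(bucket.getName());
        }
    }
}

If a key is deleted or made inactive in the console, calls signed with it start failing immediately, which is why rotating keys regularly is such a cheap safeguard.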
The CloudFront key pairs

The CloudFront key pairs are very similar to the access keys. Without these keys, you will not be able to perform any operation on CloudFront. Unlike an access key (which consists only of an access key ID and a secret access key), here you get a private key and a public key along with the access key ID, as shown in the following screenshot:

If you lose these keys, you need to delete the key pair and create a new one; this, too, is an added security measure.

X.509 certificates

X.509 certificates are mandatory if you wish to make SOAP requests against any AWS service. Clicking on Create new certificate will display a window that lets you do exactly that.

Account identifiers

There are two IDs that are used to identify ourselves when accessing a service via the API or SDK: the AWS account ID and the canonical user ID. These two IDs are unique. Just as with the preceding credentials, never share these IDs or let anyone see them. If someone has your access key or key pair, the best option is to generate a new one, but it is not possible to generate a new account ID or canonical user ID.

Summary

In this article, you became familiar with the AWS Management Console and its commonly used SDKs and IDEs, including the AWS plugin configuration for the Eclipse IDE. You also explored a few of the important security aspects of AWS and saw how AWS handles them, and you learned about the different AWS tools that are available to make a programmer's development work easier.

Resources for Article:

Further resources on this subject:

Amazon DynamoDB - Modelling relationships, Error handling [article]
A New Way to Scale [article]
Deployment and Post Deployment [article]

Creating Java EE Applications

Packt
24 Oct 2014
16 min read
In this article by Grant Shipley author of Learning OpenShift we are going to learn how to use OpenShift in order to create and deploy Java-EE-based applications using the JBoss Enterprise Application Platform (EAP) application server. To illustrate and learn the concepts of Java EE, we are going to create an application that displays an interactive map that contains all of the major league baseball parks in the United States. We will start by covering some background information on the Java EE framework and then introduce each part of the sample application. The process for learning how to create the sample application, named mlbparks, will be started by creating the JBoss EAP container, then adding a database, creating the web services, and lastly, creating the responsive map UI. (For more resources related to this topic, see here.) Evolution of Java EE I can't think of a single programming language other than Java that has so many fans while at the same time has a large community of developers that profess their hatred towards it. The bad reputation that Java has can largely be attributed to early promises made by the community when the language was first released and then not being able to fulfill these promises. Developers were told that we would be able to write once and run anywhere, but we quickly found out that this meant that we could write once and then debug on every platform. Java was also perceived to consume more memory than required and was accused of being overly verbose by relying heavily on XML configuration files. Another problem the language had was not being able to focus on and excel at one particular task. We used Java to create thick client applications, applets that could be downloaded via a web browser, embedded applications, web applications, and so on. Having Java available as a tool that completes most projects was a great thing, but the implementation for each project was often confusing. For example, let's examine the history of the GUI development using the Java programming language. When the language was first introduced, it included an API called the Abstract Window Toolkit (AWT) that was essentially a Java wrapper around native UI components supplied by the operating system. When Java 1.2 was released, the AWT implementation was deprecated in the favor of the Swing API that contained GUI elements written in 100 percent Java. By this time, a lot of developers were quickly growing frustrated with the available APIs and a new toolkit called the Standard Widget Toolkit (SWT) was developed to create another UI toolkit for Java. SWT was developed at IBM and is the windowing toolkit in use by the Eclipse IDE and is considered by most to be the superior toolkit that can be used when creating applications. As you can see, rapid changes in the core functionality of the language coupled with the refusal of some vendors to ship the JRE as part of the operating system left a bad taste in most developers' mouths. Another reason why developers began switching from Java to more attractive programming languages was the implementation of Enterprise JavaBeans (EJB). The first Java EE release occurred in December, 1999, and the Java community is just now beginning to recover from the complexity introduced by the language in order to create applications. If you were able to escape creating applications using early EJBs, consider yourself one of the lucky ones, as many of your fellow developers were consumed by implementing large-scale systems using this new technology. 
It wasn't fun; trust me. I was there and experienced it firsthand. When developers began abandoning Java EE, they seemed to go in one of two directions. Developers who understood that the Java language itself was quite beautiful and useful adopted the Spring Framework methodology of having enterprise-grade features while sticking with a Plain Old Java Object (POJO) implementation. Other developers were wooed away by languages that were considered more modern, such as Ruby and the popular Rails framework. While the rise in popularity of both Ruby and Spring was happening, the team behind Java EE continued to improve and innovate, which resulted in a new implementation that is both easy to use and easy to develop with. I am happy to report that if you haven't taken a look at Java EE in the last few years, now is the time to do so. Working with the language after a long hiatus has been a rewarding and pleasurable experience.

Introducing the sample application

For the remainder of this article, we are going to develop an application called mlbparks that displays a map of the United States with a pin representing the location of each major league baseball stadium. The requirements for the application are as follows:

A single map that a user can zoom in and out of
As the user moves the map around, the map must be updated with all baseball stadiums that are located in the shown area
The location of the stadiums must be searchable based on map coordinates that are passed to the REST-based API
The data should be transferred in the JSON format
The web application must be responsive so that it is displayed correctly regardless of the resolution of the browser
When a stadium is listed on the map, the user should be able to click on the stadium to view details about the associated team

The end state application will look like the following screenshot:

The user will also be able to zoom in on a specific location by double-clicking on the map or by clicking on the + zoom button in the top-left corner of the application. For example, if a user zooms the map in to the Phoenix, Arizona area of the United States, they will be able to see the information for the Arizona Diamondbacks stadium, as shown in the following screenshot:

To view this sample application running live, open your browser and go to http://mlbparks-packt.rhcloud.com. Now that we have our requirements and know what the end result should look like, let's start creating our application.

Creating a JBoss EAP application

For the sample application that we are going to develop as part of this article, we are going to take advantage of the JBoss EAP application server that is available on the OpenShift platform. The JBoss EAP application server is a fully tested, stable, and supported platform for deploying mission-critical applications. Some developers prefer to use the open source community application server from JBoss called WildFly. Keep in mind when choosing WildFly over EAP that it only comes with community-based support and is a bleeding-edge application server. To get started with building the mlbparks application, the first thing we need to do is create a gear that contains the cartridge for our JBoss EAP runtime. For this, we are going to use the RHC tools.
Open up your terminal application and enter the following command:

$ rhc app create mlbparks jbosseap-6

Once the previous command is executed, you should see the following output:

Application Options
-------------------
Domain:     yourDomainName
Cartridges: jbosseap-6 (addtl. costs may apply)
Gear Size:  default
Scaling:    no

Creating application 'mlbparks' ... done
Waiting for your DNS name to be available ... done
Cloning into 'mlbparks'...

Your application 'mlbparks' is now available.

URL:        http://mlbparks-yourDomainName.rhcloud.com/
SSH to:     5311180f500446f54a0003bb@mlbparks-yourDomainName.rhcloud.com
Git remote: ssh://5311180f500446f54a0003bb@mlbparks-yourDomainName.rhcloud.com/~/git/mlbparks.git/
Cloned to:  /home/gshipley/code/mlbparks

Run 'rhc show-app mlbparks' for more details about your app.

If you have a paid subscription to OpenShift Online, you might want to consider using a medium- or large-size gear to host your Java-EE-based applications. To create this application using a medium-size gear, use the following command:

$ rhc app create mlbparks jbosseap-6 -g medium

Adding database support to the application

Now that our application gear has been created, the next thing we want to do is embed a database cartridge that will hold the information about the baseball stadiums we want to track. Given that we are going to develop an application that doesn't require referential integrity but provides a REST-based API that will return JSON, it makes sense to use MongoDB as our database.

MongoDB is arguably the most popular NoSQL database available today. The company behind the database, MongoDB, offers paid subscriptions and support plans for production deployments. For more information on this popular NoSQL database, visit www.mongodb.com.

Run the following command to embed a database into our existing mlbparks OpenShift gear:

$ rhc cartridge add mongodb-2.4 -a mlbparks

Once the preceding command is executed and the database has been added to your application, you will see the following information on the screen, which contains the username and password for the database:

Adding mongodb-2.4 to application 'mlbparks' ... done

mongodb-2.4 (MongoDB 2.4)
-------------------------
Gears:          Located with jbosseap-6
Connection URL: mongodb://$OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/
Database Name:  mlbparks
Password:       q_6eZ22-fraN
Username:       admin

MongoDB 2.4 database added. Please make note of these credentials:

  Root User:      admin
  Root Password:  yourPassword
  Database Name:  mlbparks

Connection URL: mongodb://$OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/

Importing the MLB stadiums into the database

Now that we have our application gear created and our database added, we need to populate the database with the information about the stadiums that we are going to place on the map.
The data is provided as a JSON document and contains the following information:

The name of the baseball team
The total payroll for the team
The location of the stadium, represented as longitude and latitude
The name of the stadium
The name of the city where the stadium is located
The league the baseball club belongs to (National or American)
The year the data is relevant for
All of the players on the roster, including their position and salary

A sample document for the Arizona Diamondbacks looks like the following:

{
  "name": "Diamondbacks",
  "payroll": 89000000,
  "coordinates": [
    -112.066662,
    33.444799
  ],
  "ballpark": "Chase Field",
  "city": "Phoenix",
  "league": "National League",
  "year": "2013",
  "players": [
    {
      "name": "Miguel Montero",
      "position": "Catcher",
      "salary": 10000000
    },
    …………
  ]
}

In order to import this data, we are going to work over SSH. To get started with the import, SSH into your OpenShift gear for the mlbparks application by issuing the following command in your terminal prompt:

$ rhc app ssh mlbparks

Once we are connected to the remote gear, we need to download the JSON file and store it in the /tmp directory of our gear. To complete these steps, use the following commands on your remote gear:

$ cd /tmp
$ wget https://raw.github.com/gshipley/mlbparks/master/mlbparks.json

Wget is a software package that is available on most Linux-based operating systems and retrieves files using HTTP, HTTPS, or FTP. Once the file has finished downloading, take a quick look at its contents using your favorite text editor in order to get familiar with the structure of the document. When you are comfortable with the data that we are going to import into the database, execute the following command on the remote gear to populate MongoDB with the JSON documents:

$ mongoimport --jsonArray -d $OPENSHIFT_APP_NAME -c teams --type json --file /tmp/mlbparks.json -h $OPENSHIFT_MONGODB_DB_HOST --port $OPENSHIFT_MONGODB_DB_PORT -u $OPENSHIFT_MONGODB_DB_USERNAME -p $OPENSHIFT_MONGODB_DB_PASSWORD

If the command was executed successfully, you should see the following output on the screen:

connected to: 127.7.150.130:27017
Fri Feb 28 20:57:24.125 check 9 30
Fri Feb 28 20:57:24.126 imported 30 objects

What just happened? To understand this, we need to break the command we issued into smaller chunks, as detailed in the following list:

mongoimport: This command is provided by MongoDB to allow users to import data into a database.
--jsonArray: This specifies that we are going to import an array of JSON documents.
-d $OPENSHIFT_APP_NAME: This specifies the database into which we are going to import the data. We are using a system environment variable to refer to the database that was created by default when we embedded the database cartridge in our application.
-c teams: This defines the collection to which we want to import the data. If the collection does not exist, it will be created.
--type json: This specifies the type of file we are going to import.
--file /tmp/mlbparks.json: This specifies the full path and name of the file that we are going to import into the database.
-h $OPENSHIFT_MONGODB_DB_HOST: This specifies the host of the MongoDB server.
--port $OPENSHIFT_MONGODB_DB_PORT: This specifies the port of the MongoDB server.
-u $OPENSHIFT_MONGODB_DB_USERNAME: This specifies the username used to authenticate to the database.
-p $OPENSHIFT_MONGODB_DB_PASSWORD: This specifies the password used to authenticate to the database.

To verify that the data was loaded properly, you can use the following command, which prints out the number of documents in the teams collection of the mlbparks database:

$ mongo -quiet $OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/$OPENSHIFT_APP_NAME -u $OPENSHIFT_MONGODB_DB_USERNAME -p $OPENSHIFT_MONGODB_DB_PASSWORD --eval "db.teams.count()"

The result should be 30. Lastly, we need to create a 2d index on the teams collection to ensure that we can perform spatial queries on the data. Geospatial queries are what allow us to search for specific documents that fall within a given location, as provided by the latitude and longitude parameters. To add the 2d index to the teams collection, enter the following command on the remote gear:

$ mongo $OPENSHIFT_MONGODB_DB_HOST:$OPENSHIFT_MONGODB_DB_PORT/$OPENSHIFT_APP_NAME --eval 'db.teams.ensureIndex( { coordinates : "2d" } );'
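To see what this index buys us, the following sketch shows the kind of bounding-box query that the application's REST-based API will eventually run whenever the map viewport changes. It assumes the MongoDB Java driver 2.x (added to the project in the next section) and an already-connected DB handle; the collection name matches the import above, while the method and variable names are just illustrative:

import com.mongodb.BasicDBList;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;

public class StadiumQuery {
    // Prints all teams whose stadium coordinates fall inside the given map bounds.
    public static void printTeamsInBox(DB mongoDB, double west, double south, double east, double north) {
        DBCollection teams = mongoDB.getCollection("teams");

        // Build { coordinates : { $within : { $box : [ [west, south], [east, north] ] } } }
        BasicDBList lowerLeft = new BasicDBList();
        lowerLeft.add(west);
        lowerLeft.add(south);
        BasicDBList upperRight = new BasicDBList();
        upperRight.add(east);
        upperRight.add(north);
        BasicDBList box = new BasicDBList();
        box.add(lowerLeft);
        box.add(upperRight);

        DBObject query = new BasicDBObject("coordinates",
                new BasicDBObject("$within", new BasicDBObject("$box", box)));

        DBCursor cursor = teams.find(query);
        try {
            while (cursor.hasNext()) {
                DBObject team = cursor.next();
                System.out.println(team.get("name") + " - " + team.get("ballpark"));
            }
        } finally {
            cursor.close();
        }
    }
}

The 2d index created above is what lets MongoDB answer this kind of $box search quickly rather than scanning the whole collection; it is also required for $near-style proximity queries.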
Adding database support to our Java application

The next step in creating the mlbparks application is adding the MongoDB driver dependency to our application. OpenShift Online supports the popular Apache Maven build system as the default way of compiling the source code and resolving dependencies. Maven was originally created to simplify the build process by allowing developers to specify the JARs that their application depends on. This alleviates the bad practice of checking JAR files into the source code repository and provides a way to share JARs across several projects. This is accomplished via a pom.xml file that contains configuration items and dependency information for the project.

In order to add the dependency for the MongoDB client to our mlbparks application, we need to modify the pom.xml file that is in the root directory of the Git repository. The Git repository was cloned to our local machine during the application creation step that we performed earlier in this article. Open up your favorite text editor and modify the pom.xml file to include the following lines in the <dependencies> block:

<dependency>
  <groupId>org.mongodb</groupId>
  <artifactId>mongo-java-driver</artifactId>
  <version>2.9.1</version>
</dependency>

Once you have added the dependency, commit the changes to your local repository by using the following command:

$ git commit -am "added MongoDB dependency"

Finally, let's push the change to our Java application to include the MongoDB database drivers using the git push command:

$ git push

The first time the Maven build system builds the application, it downloads all the dependencies for the application and then caches them. Because of this, the first build will always take a bit longer than any subsequent build.

Creating the database access class

At this point, we have our application created, the MongoDB database embedded, all the information for the baseball stadiums imported, and the dependency for our database driver added to our application. The next step is to do some actual coding by creating a Java class that will act as the interface for connecting to and communicating with the MongoDB database.

Create a Java file named DBConnection.java in the mlbparks/src/main/java/org/openshift/mlbparks/mongo directory and add the following source code:

package org.openshift.mlbparks.mongo;

import java.net.UnknownHostException;

import javax.annotation.PostConstruct;
import javax.enterprise.context.ApplicationScoped;
import javax.inject.Named;

import com.mongodb.DB;
import com.mongodb.Mongo;

@Named
@ApplicationScoped
public class DBConnection {

    private DB mongoDB;

    public DBConnection() {
        super();
    }

    @PostConstruct
    public void afterCreate() {
        String mongoHost = System.getenv("OPENSHIFT_MONGODB_DB_HOST");
        String mongoPort = System.getenv("OPENSHIFT_MONGODB_DB_PORT");
        String mongoUser = System.getenv("OPENSHIFT_MONGODB_DB_USERNAME");
        String mongoPassword = System.getenv("OPENSHIFT_MONGODB_DB_PASSWORD");
        String mongoDBName = System.getenv("OPENSHIFT_APP_NAME");
        int port = Integer.decode(mongoPort);

        Mongo mongo = null;
        try {
            mongo = new Mongo(mongoHost, port);
        } catch (UnknownHostException e) {
            System.out.println("Couldn't connect to MongoDB: " + e.getMessage() + " :: " + e.getClass());
        }

        mongoDB = mongo.getDB(mongoDBName);
        if (mongoDB.authenticate(mongoUser, mongoPassword.toCharArray()) == false) {
            System.out.println("Failed to authenticate DB ");
        }
    }

    public DB getDB() {
        return mongoDB;
    }
}

The preceding source code, as well as all the source code for this article, is available on GitHub at https://github.com/gshipley/mlbparks. The code snippet simply creates an application-scoped bean that is available until the application is shut down. The @ApplicationScoped annotation is used when creating application-wide data or constants that should be available to all the users of the application. We chose this scope because we want to maintain a single connection class for the database that is shared among all requests. The next bit of interesting code is the afterCreate method, which authenticates against the database using the system environment variables.

Once you have created the DBConnection.java file and added the preceding source code, add the file to your local repository and commit the changes as follows:

$ git add .
$ git commit -am "Adding database connection class"

Creating the beans.xml file

The DBConnection class we just created makes use of Contexts and Dependency Injection (CDI), which is part of the Java EE specification, for dependency injection. According to the official specification for CDI, an application that uses CDI must have a file called beans.xml, and the file must be present and located under the WEB-INF directory. Given this requirement, create a file named beans.xml under the mlbparks/src/main/webapp/WEB-INF directory and add the following lines:

<?xml version="1.0"?>
<beans xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://jboss.org/schema/cdi/beans_1_0.xsd"/>

After you have added the beans.xml file, add and commit it to your local Git repository:

$ git add .
$ git commit -am "Adding beans.xml for CDI"

Summary

In this article, we learned about the evolution of Java EE, created a JBoss EAP application, and created the database access class.

Resources for Article:

Further resources on this subject:

Using OpenShift [Article]
Common performance issues [Article]
The Business Layer (Java EE 7 First Look) [Article]

Web API and Client Integration

Packt
10 Oct 2014
9 min read
In this article written by Geoff Webber-Cross, the author of Learning Microsoft Azure, we'll create an on-premise production management client Windows application allowing manufacturing staff to view and update order and batch data and a web service to access data in the production SQL database and send order updates to the Service Bus topic. (For more resources related to this topic, see here.) The site's main feature is an ASP.NET Web API 2 HTTP service that allows the clients to read order and batch data. The site will also host a SignalR (http://signalr.net/) hub that allows the client to update order and batch statuses and have the changes broadcast to all the on-premise clients to keep them synchronized in real time. Both the Web API and SignalR hubs will use the Azure Active Directory authentication. We'll cover the following topic in this article: Building a client application Building a client application For the client application, we'll create a WPF client application to display batches and orders and allow us to change their state. We'll use MVVM Light again, like we did for the message simulator we created in the sales solution, to help us implement a neat MVVM pattern. We'll create a number of data services to get data from the API using Azure AD authentication. Preparing the WPF project We'll create a WPF application and install NuGet packages for MVVM Light, JSON.NET, and Azure AD authentication in the following procedure (for the Express version of Visual Studio, you'll need Visual Studio Express for desktops): Add a WPF project to the solution called ManagementApplication. In the NuGet Package Manager Console, enter the following command to install MVVM Light: install-package mvvmlight Now, enter the following command to install the Microsoft.IdentityModel.Clients.ActiveDirectory package: install-package Microsoft.IdentityModel.Clients.ActiveDirectory Now, enter the following command to install JSON.NET: install-package newtonsoft.json Enter the following command to install the SignalR client package (note that this is different from the server package): Install-package Microsoft.AspNet.SignalR.Client Add a project reference to ProductionModel by right-clicking on the References folder and selecting Add Reference, check ProductionModel by navigating to the Solution | Projects tab, and click on OK. Add a project reference to System.Configuraton and System.Net.Http by right-clicking on the References folder and selecting Add Reference, check System.Config and System.Net.Http navigating to the Assemblies | Framework tab, and click on OK. In the project's Settings.settings file, add a string setting called Token to store the user's auth token. 
Add the following appSettings block to App.config; I've put comments to help you understand (and remember) what they stand for and added commented-out settings for the Azure API: <appSettings> <!-- AD Tenant --> <add key="ida:Tenant" value="azurebakery.onmicrosoft.com" />    <!-- The target api AD application APP ID (get it from    config tab in portal) --> <!-- Local --> <add key="ida:Audience"    value="https://azurebakery.onmicrosoft.com/ManagementWebApi" /> <!-- Azure --> <!-- <add key="ida:Audience"    value="https://azurebakery.onmicrosoft.com/      WebApp-azurebakeryproduction.azurewebsites.net" /> -->    <!-- The client id of THIS application (get it from    config tab in portal) --> <add key="ida:ClientID" value=    "1a1867d4-9972-45bb-a9b8-486f03ad77e9" />    <!-- Callback URI for OAuth workflow --> <add key="ida:CallbackUri"    value="https://azurebakery.com" />    <!-- The URI of the Web API --> <!-- Local --> <add key="serviceUri" value="https://localhost:44303/" /> <!-- Azure --> <!-- <add key="serviceUri" value="https://azurebakeryproduction.azurewebsites.net/" /> --> </appSettings> Add the MVVM Light ViewModelLocator to Application.Resources in App.xaml: <Application.Resources>    <vm:ViewModelLocator x_Key="Locator"      d_IsDataSource="True"                      DataContext="{Binding Source={StaticResource          Locator}, Path=Main}"        Title="Production Management Application"          Height="350" Width="525"> Creating an authentication base class Since the Web API and SignalR hubs use Azure AD authentication, we'll create services to interact with both and create a common base class to ensure that all requests are authenticated. This class uses the AuthenticationContext.AquireToken method to launch a built-in login dialog that handles the OAuth2 workflow and returns an authentication token on successful login: using Microsoft.IdentityModel.Clients.ActiveDirectory; using System; using System.Configuration; using System.Diagnostics; using System.Net;   namespace AzureBakery.Production.ManagementApplication.Services {    public abstract class AzureAdAuthBase    {        protected AuthenticationResult Token = null;          protected readonly string ServiceUri = null;          protected AzureAdAuthBase()        {            this.ServiceUri =              ConfigurationManager.AppSettings["serviceUri"]; #if DEBUG            // This will accept temp SSL certificates            ServicePointManager.ServerCertificateValidationCallback += (se, cert, chain, sslerror) => true; #endif        }          protected bool Login()        {            // Our AD Tenant domain name            var tenantId =              ConfigurationManager.AppSettings["ida:Tenant"];              // Web API resource ID (The resource we want to use)            var resourceId =              ConfigurationManager.AppSettings["ida:Audience"];              // Client App CLIENT ID (The ID of the AD app for this            client application)            var clientId =              ConfigurationManager.AppSettings["ida:ClientID"];              // Callback URI            var callback = new              Uri(ConfigurationManager.AppSettings["ida:CallbackUri"]);              var authContext = new              AuthenticationContext(string.Format("https://login.windows.net/{0}", tenantId));              if(this.Token == null)            {                // See if we have a cached token               var token = Properties.Settings.Default.Token;                if (!string.IsNullOrWhiteSpace(token))                   
 this.Token = AuthenticationResult.Deserialize(token);            }                       if (this.Token == null)             {                try                {                    // Acquire fresh token - this will get user to                    login                                  this.Token =                      authContext.AcquireToken(resourceId,                         clientId, callback);                }                catch(Exception ex)                {                    Debug.WriteLine(ex.ToString());                      return false;                }            }            else if(this.Token.ExpiresOn < DateTime.UtcNow)            {                // Refresh existing token this will not require                login                this.Token =                  authContext.AcquireTokenByRefreshToken(this.Token.RefreshToken,                   clientId);            }                      if (this.Token != null && this.Token.ExpiresOn >              DateTime.UtcNow)            {                // Store token                Properties.Settings.Default.Token =                  this.Token.Serialize(); // This should be                    encrypted                Properties.Settings.Default.Save();                  return true;            }              // Clear token            this.Token = null;              Properties.Settings.Default.Token = null;            Properties.Settings.Default.Save();              return false;        }    } } The token is stored in user settings and refreshed if necessary, so the users don't have to log in to the application every time they use it. The Login method can be called by derived service classes every time a service is called to check whether the user is logged in and whether there is a valid token to use. Creating a data service We'll create a DataService class that derives from the AzureAdAuthBase class we just created and gets data from the Web API service using AD authentication. 
First, we'll create a generic helper method that calls an API GET action using the HttpClient class with the authentication token added to the Authorization header, and deserializes the returned JSON object into a .NET-typed object T: private async Task<T> GetData<T>(string action) {    if (!base.Login())        return default(T);      // Call Web API    var authHeader = this.Token.CreateAuthorizationHeader();    var client = new HttpClient();    var uri = string.Format("{0}{1}", this.ServiceUri,      string.Format("api/{0}", action));    var request = new HttpRequestMessage(HttpMethod.Get, uri);    request.Headers.TryAddWithoutValidation("Authorization",      authHeader);      // Get response    var response = await client.SendAsync(request);    var responseString = await response.Content.ReadAsStringAsync();      // Deserialize JSON    var data = await Task.Factory.StartNew(() =>      JsonConvert.DeserializeObject<T>(responseString));      return data; } Once we have this, we can quickly create methods for getting order and batch data like this:   public async Task<IEnumerable<Order>> GetOrders() {    return await this.GetData<IEnumerable<Order>>("orders"); }   public async Task<IEnumerable<Batch>> GetBatches() {    return await this.GetData<IEnumerable<Batch>>("batches"); } This service implements an IDataService interface and is registered in the ViewModelLocator class, ready to be injected into our view models like this: SimpleIoc.Default.Register<IDataService, DataService>(); Creating a SignalR service We'll create another service derived from the AzureAdAuthBase class, which is called ManagementService, and which sends updated orders to the SignalR hub and receives updates from the hub originating from other clients to keep the UI updated in real time. First, we'll create a Register method, which creates a hub proxy using our authorization token from the base class, registers for updates from the hub, and starts the connection: private IHubProxy _proxy = null;   public event EventHandler<Order> OrderUpdated; public event EventHandler<Batch> BatchUpdated;   public ManagementService() {   }   public async Task Register() {    // Login using AD OAuth    if (!this.Login())        return;      // Get header from auth token    var authHeader = this.Token.CreateAuthorizationHeader();      // Create hub proxy and add auth token    var cnString = string.Format("{0}signalr", base.ServiceUri);    var hubConnection = new HubConnection(cnString, useDefaultUrl:      false);    this._proxy = hubConnection.CreateHubProxy("managementHub");    hubConnection.Headers.Add("Authorization", authHeader);      // Register for order updates    this._proxy.On<Order>("updateOrder", order =>    {        this.OnOrderUpdated(order);    });        // Register for batch updates    this._proxy.On<Batch>("updateBatch", batch =>    {        this.OnBatchUpdated(batch);    });        // Start hub connection    await hubConnection.Start(); } The OnOrderUpdated and OnBatchUpdated methods call events to notify about updates. 
Now, add two methods that call the hub methods we created in the website using the IHubProxy.Invoke<T> method: public async Task<bool> UpdateOrder(Order order) {    // Invoke updateOrder method on hub    await this._proxy.Invoke<Order>("updateOrder",      order).ContinueWith(task =>    {        return !task.IsFaulted;    });      return false; }   public async Task<bool> UpdateBatch(Batch batch) {    // Invoke updateBatch method on hub    await this._proxy.Invoke<Batch>("updateBatch",      batch).ContinueWith(task =>    {        return !task.IsFaulted;    });      return false; } This service implements an IManagementService interface and is registered in the ViewModelLocator class, ready to be injected into our view models like this: SimpleIoc.Default.Register<IManagementService, ManagementService>(); Testing the application To test the application locally, we need to start the Web API project and the WPF client application at the same time. So, under the Startup Project section in the Solution Properties dialog, check Multiple startup projects, select the two applications, and click on OK: Once running, we can easily debug both applications simultaneously. To test the application with the service running in the cloud, we need to deploy the service to the cloud, and then change the settings in the client app.config file (remember we put the local and Azure settings in the config with the Azure settings commented-out, so swap them around). To debug the client against the Azure service, make sure that only the client application is running (select Single startup project from the Solution Properties dialog). Summary We learned how to use a Web API to enable the production management Windows client application to access data from our production database and a SignalR hub to handle order and batch changes, keeping all clients updated and messaging the Service Bus topic. Resources for Article: Further resources on this subject: Using the Windows Azure Platform PowerShell Cmdlets [Article] Windows Azure Mobile Services - Implementing Push Notifications using [Article] Using Azure BizTalk Features [Article]

Building, Publishing, and Supporting Your Force.com Application

Packt
22 Sep 2014
39 min read
In this article by Andrew Fawcett, the author of Force.com Enterprise Architecture, we will use the declarative aspects of the platform to quickly build an initial version of an application, which will give you an opportunity to get some hands-on experience with some of the packaging and installation features that are needed to release applications to subscribers. We will also take a look at the facilities available to publish your application through Salesforce AppExchange (equivalent to the Apple App Store) and finally provide end user support.

(For more resources related to this topic, see here.)

We will then use this application as a basis for incrementally releasing new versions of the application to build our understanding of Enterprise Application Development. The following topics outline what we will achieve in this article:

Required organizations
Introducing the sample application
Package types and benefits
Creating your first managed package
Package dependencies and uploading
Introduction to AppExchange and creating listings
Installing and testing your package
Becoming a Salesforce partner and its benefits
Licensing
Supporting your application
Customer metrics
Trialforce and Test Drive

Required organizations

Several Salesforce organizations are required to develop, package, and test your application. You can sign up for these organizations at https://developer.salesforce.com/, though in due course, as your relationship with Salesforce becomes more formal, you will have the option of accessing their Partner Portal website to create organizations of different types and capabilities. We will discuss more on this later.

It's a good idea to have some kind of naming convention to keep track of the different organizations and logins. Use the following as a guide and create these organizations via https://developer.salesforce.com/. As stated earlier, these organizations will be used only for the purposes of learning and exploring:

Username: myapp@packaging.my.com
Usage: Packaging
Purpose: Though we will perform initial work in this org, it will eventually be reserved solely for assembling and uploading a release.

Username: myapp@testing.my.com
Usage: Testing
Purpose: In this org, we will install the application and test upgrades. You may want to create several of these in practice, via the Partner Portal website described later in this article.

Username: myapp@dev.my.com
Usage: Developing
Purpose: Later, we will shift development of the application into this org, leaving the packaging org to focus only on packaging.

You will have to substitute myapp and my.com (perhaps by reusing your company domain name to avoid naming conflicts) with your own values; for reference, the packaging org used in this article's examples is andyapp@packaging.andyinthecloud.com.

The following are other organization types that you will eventually need in order to manage the publication and licensing of your application:

Production / CRM Org: Your organization may already be using this org for managing contacts, leads, opportunities, cases, and other CRM objects. Make sure that you have the complete authority to make changes, if any, to this org since this is where you run your business. If you do not have such an org, you can request one via the Partner Program website described later in this article, by requesting (via a case) a CRM ISV org. Even if you choose to not fully adopt Salesforce for this part of your business, such an org is still required when it comes to utilizing the licensing aspects of the platform.
AppExchange Publishing Org (APO): This org is used to manage your use of AppExchange. We will discuss this a little later in this article. This org is actually the same Salesforce org you designate as your production org, from where you conduct your sales and support activities.

License Management Org (LMO): Within this organization, you can track who installs your application (as leads), the licenses you grant to them, and for how long. It is recommended that this is the same org as the APO described earlier.

Trialforce Management Org (TMO) and Trialforce Source Org (TSO): Trialforce is a way to provide orgs with your preconfigured application data for prospective customers to try out your application before buying. It will be discussed later in this article.

Typically, the LMO and APO can be the same as your primary Salesforce production org, which allows you to track all your leads and future opportunities in the same place. This leads to the rule of APO = LMO = production org, though neither of them should be your actual developer or test orgs. You can work with Salesforce support and your Salesforce account manager to plan and assign these orgs.

Introducing the sample application

For this article, we will use the world of Formula1 motor car racing as the basis for a packaged application that we will build together. Formula1 is, for me, the motor sport that is equivalent to Enterprise application software, due to its scale and complexity. It is also a sport that I follow, both of which helped me when building the examples that we will use. We will refer to this application as FormulaForce, though please keep in mind Salesforce's branding policies when naming your own application, as they prevent the use of the word "Force" in company or product titles. This application will focus on the data collection aspects of the races, drivers, and their many statistics, utilizing platform features to structure, visualize, and process this data in both historic and current contexts.

For this article, we will create some initial Custom Objects as detailed in the following list. Do not worry about creating any custom tabs just yet. You can use your preferred approach for creating these initial objects. Ensure that you are logged in to your packaging org.

Season__c: Name (text)
Race__c: Name (text), Season__c (Master-Detail to Season__c)
Driver__c: Name
Contestant__c: Name (Auto Number, CONTESTANT-{00000000}), Race__c (Master-Detail to Race__c), Driver__c (Lookup to Driver__c)

The following screenshot shows the preceding objects within the Schema Builder tool, available under the Setup menu:

Package types and benefits

A package is a container that holds your application components, such as Custom Objects, Apex code, Apex triggers, Visualforce pages, and so on; together these make up your application. While there are other ways to move components between Salesforce orgs, a package provides a container that you can use for your entire application or for delivering optional features through so-called extension packages.

There are two types of packages, managed and unmanaged. Unmanaged packages result in the transfer of components from one org to another; however, the result is as if those components had been originally created in the destination org, meaning that they can be readily modified or even deleted by the administrator of that org. They are also not upgradable and are not particularly ideal from a support perspective.
Moreover, the Apex code that you write is also visible for all to see, so your Intellectual Property is at risk. Unmanaged packages can be used for sharing template components that are intended to be changed by the subscriber. If you are not using GitHub and the GitHub Salesforce Deployment Tool (https://github.com/afawcett/githubsfdeploy), they can also provide a means to share open source libraries to developers. Features and benefits of managed packages Managed packages have the following features that are ideal for distributing your application. The org where your application package is installed is referred to as a subscriber org, since users of this org are subscribing to the services your application provides: Intellectual Property (IP) protection: Users in the subscriber org cannot see your Apex source code, although they can see your Visualforce pages code and static resources. While the Apex code is hidden, JavaScript code is not, so you may want to consider using a minify process to partially obscure such code. The naming scope: Your component names are unique to your package throughout the utilization of a namespace. This means that even if you have object X in your application, and the subscriber has an object of the same name, they remain distinct. You will define a namespace later in this article. The governor scope: Code in your application executes within its own governor limit scope (such as DML and SOQL governors that are subject to passing Salesforce Security Review) and is not affected by other applications or code within the subscriber org. Note that some governors such as the CPU time governor are shared by the whole execution context (discussed in a later article) regardless of the namespace. Upgrades and versioning: Once the subscribers have started using your application, creating data, making configurations, and so on, you will want to provide upgrades and patches with new versions of your application. There are other benefits to managed packages, but these are only accessible after becoming a Salesforce Partner and completing the security review process; these benefits are described later in this article. Salesforce provides ISVForce Guide (otherwise known as the Packaging Guide) in which these topics are discussed in depth; bookmark it now! The following is the URL for ISVForce Guide: http://login.salesforce.com/help/pdfs/en/salesforce_packaging_guide.pdf. Creating your first managed package Packages are created in your packaging org. There can be only one managed package being developed in your packaging org (though additional unmanaged packages are supported, it is not recommended to mix your packaging org with them). You can also install other dependent managed packages and reference their components from your application. The steps to be performed are discussed in the following sections: Setting your package namespace Creating the package and assigning it to the namespace Adding components to the package Setting your package namespace An important decision when creating a managed package is the namespace; this is a prefix applied to all your components (Custom Objects, Visualforce pages, and so on) and is used by developers in subscriber orgs to uniquely qualify between your packaged components and others, even those from other packages. The namespace prefix is an important part of the branding of the application since it is implicitly attached to any Apex code or other components that you include in your package. 
It can be up to 15 characters, though I personally recommend that you keep it less than this, as it becomes hard to remember and leads to frustrating typos if you make it too complicated. I would also avoid underscore characters as well. It is a good idea to have a naming convention if you are likely to create more managed packages in the future (in different packaging orgs). The following is the format of an example naming convention: [company acronym - 1 to 4 characters][package prefix 1 to 4 characters] For example, the ACME Corporation's Road Runner application might be named acmerr. When the namespace has not been set, the Packages page (accessed under the Setup menu under the Create submenu) indicates that only unmanaged packages can be created. Click on the Edit button to begin a small wizard to enter your desired namespace. This can only be done once and must be globally unique (meaning it cannot be set in any other org), much like a website domain name. The following screenshot shows the Packages page: Once you have set the namespace, the preceding page should look like the following screenshot with the only difference being the namespace prefix that you have used. You are now ready to create a managed package and assign it to the namespace. Creating the package and assigning it to the namespace Click on the New button on the Packages page and give your package a name (it can be changed later). Make sure to tick the Managed checkbox as well. Click on Save and return to the Packages page, which should now look like the following: Adding components to the package In the Packages page, click on the link to your package in order to view its details. From this page, you can manage the contents of your package and upload it. Click on the Add button to add the Custom Objects created earlier in this article. Note that you do not need to add any custom fields; these are added automatically. The following screenshot shows broadly what your Package Details page should look like at this stage: When you review the components added to the package, you will see that some components can be removed while other components cannot be removed. This is because the platform implicitly adds some components for you as they are dependencies. As we progress, adding different component types, you will see this list automatically grow in some cases, and in others, we must explicitly add them. Extension packages As the name suggests, extension packages extend or add to the functionality delivered by the existing packages they are based on, though they cannot change the base package contents. They can extend one or more base packages, and you can even have several layers of extension packages, though you may want to keep an eye on how extensively you use this feature, as managing inter-package dependency can get quite complex to manage, especially during development. Extension packages are created in pretty much the same way as the process you've just completed (including requiring their own packaging org), except that the packaging org must also have the dependent packages installed in it. As code and Visualforce pages contained within extension packages make reference to other Custom Objects, fields, Apex code, and Visualforce pages present in base packages. The platform tracks these dependencies and the version of the base package present at the time the reference was made. 
When an extension package is installed, this dependency information ensures that the subscriber org must have the correct version (minimum) of the base packages installed before permitting the installation to complete. You can also manage the dependencies between extension packages and base packages yourself through the Versions tab or XML metadata for applicable components. Package dependencies and uploading Packages can have dependencies on platform features and/or other packages. You can review and manage these dependencies through the usage of the Package detail page and the use of dynamic coding conventions as described here. While some features of Salesforce are common, customers can purchase different editions and features according to their needs. Developer Edition organizations have access to most of these features for free. This means that as you develop your application, it is important to understand when and when not to use those features. By default, when referencing a certain Standard Object, field, or component type, you will generate a prerequisite dependency on your package, which your customers will need to have before they can complete the installation. Some Salesforce features, for example Multi-Currency or Chatter, have either a configuration or, in some cases, a cost impact to your users (different org editions). Carefully consider which features your package is dependent on. Most of the feature dependencies, though not all, are visible via the View Dependencies button on the Package details page (this information is also available on the Upload page, allowing you to make a final check). It is a good practice to add this check into your packaging procedures to ensure that no unwanted dependencies have crept in. Clicking on this button, for the package that we have been building in this article so far, confirms that there are no dependencies. Uploading the release and beta packages Once you have checked your dependencies, click on the Upload button. You will be prompted to give a name and version to your package. The version will be managed for you in subsequent releases. Packages are uploaded in one of two modes (beta or release). We will perform a release upload by selecting the Managed - Released option from the Release Type field, so make sure you are happy with the objects created in the earlier section of this article, as they cannot easily be changed after this point. Once you are happy with the information on the screen, click on the Upload button once again to begin the packaging process. Once the upload process completes, you will see a confirmation page as follows: Packages can be uploaded in one of two states as described here: Release packages can be installed into subscriber production orgs and also provide an upgrade path from previous releases. The downside is that you cannot delete the previously released components and change certain things such as a field's type. Changes to the components that are marked global, such as Apex Code and Visualforce components, are also restricted. While Salesforce is gradually enhancing the platform to provide the ability to modify certain released aspects, you need to be certain that your application release is stable before selecting this option. Beta packages cannot be installed into subscriber production orgs; you can install only into Developer Edition (such as your testing org), sandbox, or Partner Portal created orgs. 
Also, Beta packages cannot be upgraded once installed; hence, this is the reason why Salesforce does not permit their installation into production orgs. The key benefit is in the ability to continue to change new components of the release, to address bugs and features relating to user feedback. The ability to delete previously-published components (uploaded within a release package) is in pilot. It can be enabled through raising a support case with Salesforce Support. Once you have understood the full implications, they will enable it. We have simply added some Custom Objects. So, the upload should complete reasonably quickly. Note that what you're actually uploading to is AppExchange, which will be covered in the following sections. If you want to protect your package, you can provide a password (this can be changed afterwards). The user performing the installation will be prompted for it during the installation process. Optional package dependencies It is possible to make some Salesforce features and/or base package component references (Custom Objects and fields) an optional aspect of your application. There are two approaches to this, depending on the type of the feature. Dynamic Apex and Visualforce For example, the Multi-Currency feature adds a CurrencyIsoCode field to the standard and Custom Objects. If you explicitly reference this field, for example in your Apex or Visualforce pages, you will incur a hard dependency on your package. If you want to avoid this and make it a configuration option (for example) in your application, you can utilize dynamic Apex and Visualforce. Extension packages If you wish to package component types that are only available in subscriber orgs of certain editions, you can choose to include these in extension packages. For example, you may wish to support Professional Edition, which does not support record types. In this case, create an Enterprise Edition extension package for your application's functionality, which leverages the functionality from this edition. Note that you will need multiple testing organizations for each combination of features that you utilize in this way, to effectively test the configuration options or installation options that your application requires. Introduction to AppExchange and listings Salesforce provides a website referred to as AppExchange, which lets prospective customers find, try out, and install applications built using Force.com. Applications listed here can also receive ratings and feedback. You can also list your mobile applications on this site as well. In this section, I will be using an AppExchange package that I already own. The package has already gone through the process to help illustrate the steps that are involved. For this reason, you do not need to perform these steps; they can be revisited at a later phase in your development once you're happy to start promoting your application. Once your package is known to AppExchange, each time you click on the Upload button on your released package (as described previously), you effectively create a private listing. Private listings are not visible to the public until you decide to make them so. It gives you the chance to prepare any relevant marketing details and pricing information while final testing is completed. Note that you can still distribute your package to other Salesforce users or even early beta or pilot customers without having to make your listing public. 
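Returning to the dynamic Apex approach for the optional Multi-Currency dependency described above, the following is a minimal sketch (not taken from the article's sample code) of how a packaged class might detect the feature at runtime and only reference CurrencyIsoCode through dynamic SOQL, avoiding a hard install-time dependency:

public with sharing class CurrencyAwareAccountQuery {
    // Returns a small set of Accounts, including CurrencyIsoCode only when
    // the subscriber org actually has Multi-Currency enabled.
    public static List<Account> queryAccounts() {
        Boolean multiCurrency = UserInfo.isMultiCurrencyOrganization();

        // Referencing the field only inside a dynamic SOQL string means the
        // package compiles and installs in orgs without the feature.
        String soql = 'SELECT Id, Name' +
                      (multiCurrency ? ', CurrencyIsoCode' : '') +
                      ' FROM Account LIMIT 10';
        return Database.query(soql);
    }
}

The same idea applies to Visualforce, where currency-related output can be rendered conditionally from a controller property rather than bound directly to the field.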
In order to start building a listing, you need to log in to AppExchange using the login details you designated for your AppExchange Publishing Org (APO). Go to www.appexchange.com and click on Login in the banner at the top-right corner. This will present you with the usual Salesforce login screen. Once logged in, you should see something like this:

Select the Publishing Console option from the menu, then click on the Create New Listing button and complete the steps shown in the wizard to associate the packaging org with AppExchange; once completed, you should see it listed. It's really important that you consistently log in to AppExchange using your APO user credentials, because Salesforce will let you log in with other users. To make it easy to confirm which user you are using, consider changing the user's display name to something like MyCompany Packaging. Completing the listing steps is not a requirement at this stage, unless you want to try out the process a little further to see the type of information required; you can delete any private listings that you created once you have completed this article.

Installing and testing your package

When you uploaded your package earlier in this article, you should have received an e-mail with a link to install the package. If not, review the Versions tab on the Package detail page in your packaging org. Ensure that you're logged out and click on the link. When prompted, log in to your testing org. The installation process will start. A reduced screenshot of the initial installation page is shown in the following screenshot; click on the Continue button and follow the default installation prompts to complete the installation:

Package installation covers the following aspects (once the user has entered the package password, if one was set):

Package overview: The platform provides the user with an overview of the components that will be added or updated (if this is an upgrade). Note that due to the namespace assigned to your package, these will not overwrite existing components that the subscriber has created in their org.

Connected App and Remote Access: If the package contains components that represent connections to services outside of Salesforce, the user is prompted to approve these.

Approve Package API Access: If the package contains components that make use of the client API (such as JavaScript code), the user is prompted to confirm and/or configure this. Such components will generally not be called much; features such as JavaScript Remoting are preferred, and they leverage the Apex runtime security configured post install.

Security configuration: In this step, you can determine the initial visibility of the components being installed (objects, pages, and so on), either selecting admin-only access or choosing the Profiles to be updated. This option predates the introduction of permission sets, which permit post-installation configuration. If you package profiles in your application, the user will need to remember to map these to the existing profiles in the subscriber org as per step 2. This is a one-time option, as the profiles in the package are not actually installed, only merged. I recommend that you utilize permission sets to provide security configurations for your application. These are installed and are much more granular in nature.

When the installation is complete, navigate to the Installed Packages menu option under the Setup menu.
Here, you can see confirmation of some of your package details such as namespace and version, as well as any licensing details, which will be discussed later in this article. It is also possible to provide a Configure link for your package, which will be displayed next to the package when installed and listed on the Installed Packages page in the subscriber org. Here, you can provide a Visualforce page to access configuration options and processes, for example. If you have enabled seat-based licensing, there will also be a Manage Licenses link to determine which users in the subscriber org have access to your package components, such as tabs, objects, and Visualforce pages. Licensing, in general, is discussed in more detail later in this article.

Automating package installation

It is possible to automate some of these processes using the Salesforce Metadata API and associated tools, such as the Salesforce Migration Toolkit (available from the Tools menu under Setup), which can be run from the popular Apache Ant scripting environment. This can be useful if you want to automate the deployment of your packages to customers or test orgs. Options that require a user response, such as the security configuration, are not covered by automation; however, password-protected managed packages are supported. You can find more details on this by looking up the Installed Package component in the online help for the Salesforce Metadata API at https://www.salesforce.com/us/developer/docs/api_meta/. As an aid to performing this from Ant, a custom Ant task can be found in the sample code related to this article (see /lib/ant-salesforce.xml). The following build.xml Ant script uninstalls and reinstalls the package. Note that the installation will also upgrade a package if the package is already installed:

<project name="FormulaForce" basedir=".">
  <!-- Downloaded from Salesforce Tools page under Setup -->
  <typedef uri="antlib:com.salesforce"
           resource="com/salesforce/antlib.xml"
           classpath="${basedir}/lib/ant-salesforce.jar"/>
  <!-- Import macros to install/uninstall packages -->
  <import file="${basedir}/lib/ant-salesforce.xml"/>
  <target name="package.installdemo">
    <uninstallPackage namespace="yournamespace"
                      username="${sf.username}" password="${sf.password}"/>
    <installPackage namespace="yournamespace" version="1.0"
                    username="${sf.username}" password="${sf.password}"/>
  </target>
</project>

You can try the preceding example with your testing org by replacing the namespace attribute values with the namespace you entered earlier in this article. Enter the following command, all on one line, from the folder that contains the build.xml file (the password value is your password followed immediately by your security token):

ant package.installdemo -Dsf.username=testorgusername -Dsf.password=testorgpasswordtestorgtoken

You can also use the Salesforce Metadata API to list the packages installed in an org, for example, if you want to determine whether a dependent package needs to be installed or upgraded before sending an installation request (a sketch of this is shown a little further below). Finally, you can also uninstall packages if you wish.

Becoming a Salesforce partner and benefits

The Salesforce Partner Program has many advantages. The first place to visit is http://www.salesforce.com/partners/overview. You will want to focus on the areas of the site relating to being an Independent Software Vendor (ISV) partner. From here, you can click on Join. It is free to join, though you will want to read through the various agreements carefully, of course.
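Returning to the Ant automation above, the following is a hedged sketch of listing the packages already installed in an org via the Metadata API. It assumes the listMetadata task shipped in ant-salesforce.jar and the InstalledPackage metadata type; check the Migration Tool documentation for the exact task attributes in your version:

<project name="FormulaForce" basedir="." xmlns:sf="antlib:com.salesforce">
  <!-- Bind the Metadata API Ant tasks to the sf namespace -->
  <taskdef uri="antlib:com.salesforce"
           resource="com/salesforce/antlib.xml"
           classpath="${basedir}/lib/ant-salesforce.jar"/>
  <target name="package.listinstalled">
    <!-- Lists installed packages so a script can decide whether a
         dependency needs to be installed or upgraded first -->
    <sf:listMetadata username="${sf.username}" password="${sf.password}"
                     serverurl="https://login.salesforce.com"
                     metadataType="InstalledPackage"/>
  </target>
</project>

Run it in the same way as the earlier target, for example: ant package.listinstalled -Dsf.username=testorgusername -Dsf.password=testorgpasswordtestorgtoken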
Once you wish to start listing a package and charging users for it, you will need to arrange billing details for Salesforce to take the various fees involved. Pay careful attention to the Standard Objects used in your package, as this will determine the license type required by your users and the overall cost to them in addition to your charges. Obviously, Salesforce would prefer your application to use as many features of the CRM application as possible, which may also be beneficial to you as a feature of your application, since it's an appealing immediate integration not found on other platforms, such as the ability to instantly integrate with accounts and contacts. If you're planning on using Standard Objects and are in doubt about the costs (as they do vary depending on the type), you can request a conversation with Salesforce to discuss this; this is something to keep in mind in the early stages. Once you have completed the signup process, you will gain access to the Partner Portal (your user will end with @partnerforce.com). You must log in to the specific site as opposed to the standard Salesforce login; currently, the URL is https://www.salesforce.com/partners/login. Starting from July 2014, the http://partners.salesforce.com URL provides access to the Partner Community. Logging in to this service using your production org user credentials is recommended. The following screenshot shows what the current Partner Portal home page looks like. Here you can see some of its key features: This is your primary place to communicate with Salesforce and also to access additional materials and announcements relevant to ISVs, so do keep checking often. You can raise cases and provide additional logins to other users in your organization, such as other developers who may wish to report issues or ask questions. There is also the facility to create test or developer orgs; here, you can choose the appropriate edition (Professional, Group, Enterprise, and others) you want to test against. You can also create Partner Developer Edition orgs from this option as well. These carry additional licenses and limits over the public's so-called Single Developer Editions orgs and are thus recommended for use only once you start using the Partner Portal. Note, however, that these orgs do expire, subject to either continued activity over 6 months or renewing the security review process (described in the following section) each year. Once you click on the create a test org button, there is a link on the page displayed that navigates to a table that describes the benefits, processes, and the expiry rules. Security review and benefits The following features require that a completed package release goes through a Salesforce-driven process known as the Security review, which is initiated via your listing when logged into AppExchange. Unless you plan to give your package away for free, there is a charge involved in putting your package through this process. However, the review is optional. There is nothing stopping you from distributing your package installation URL directly. However, you will not be able to benefit from the ability to list your new application on AppExchange for others to see and review. More importantly, you will also not have access to the following features to help you deploy, license, and support your application. 
The following is a list of the benefits you get once your package has passed the security review:

Bypass subscriber org setup limits: Limits such as the number of tabs and Custom Objects are bypassed. This means that if the subscriber org has reached its maximum number of Custom Objects, your package will still install. This feature is sometimes referred to as Aloha. Without it, your package installation may fail. You can determine whether Aloha has been enabled via the Subscriber Overview page that comes with the LMA application, which is discussed in the next section.

Licensing: You are able to utilize the Salesforce-provided License Management Application (LMA) in your LMO (License Management Org, as described previously).

Subscriber support: With this feature, the users in the subscriber org can enable, for a specific period, a means for you to log in to their org (without exchanging passwords), reproduce issues, and enable much more detailed debug information such as Apex stack traces. In this mode, you can also see custom settings that you have declared as protected in your package, which are useful for enabling additional debug or advanced features.

Push upgrade: Using this feature, you can automatically apply upgrades to your subscribers without their manual intervention, either directly by you or on a scheduled basis. You may use this either to apply smaller bug fixes that don't affect Custom Objects or APIs, or to deploy full upgrades. The latter requires careful coordination and planning with your subscribers to ensure that changes and new features are adopted properly.

Salesforce asks you to perform an automated security scan of your software via a web page (http://security.force.com/security/tools/forcecom/scanner). This service can be quite slow depending on how many scans are in the queue. Another option is to obtain the Eclipse plugin from the actual vendor, CheckMarx, at http://www.checkmarx.com, which runs the same scan but allows you to control it locally. Finally, for the ultimate confidence as you develop your application, Salesforce can provide a license to integrate it into your Continuous Integration (CI) build system. Keep in mind that if you make any callouts to external services, Salesforce will also most likely ask you and/or the service provider to run a Burp scanner to check for security flaws.

Make sure you plan a reasonable amount of time (at least 2–3 weeks, in my experience) to go through the security review process; it is a must in order to list your package initially, though if timing becomes an issue, you have the option of issuing your package install URL directly to initial customers and early adopters.

Licensing

Once you have completed the security review, you can request access to the LMA by raising a support case via the Partner Portal. Once this is provided by Salesforce, use the installation URL to install it like any other package into your LMO. If you have requested a CRM for ISVs org (through a case raised within the Partner Portal), you may find the LMA already installed. The following screenshot shows the main tabs of the License Management Application once installed:

In this section, I will use a package that I already own and have already taken through the process to help illustrate the steps that are involved. For this reason, you do not need to perform these steps. After completing the installation, return to AppExchange and log in. Then, locate your listing in Publisher Console under Uploaded Packages.
Next to your package, there will be a Manage Licenses link. The first time after clicking on this link, you will be asked to connect your package to your LMO org. Once this is done, you will be able to define the license requirements for your package. The following example shows the license for a free package, with an immediately active license for all users in the subscriber org: In most cases, for packages that you intend to charge for, you would select a free trial rather than setting the license default to active immediately. For paid packages, select a license length, unless perhaps it's a one-off charge, and then select the license that does not expire. Finally, if you're providing a trial license, you need to consider carefully the default number of seats (users); users may need to be able to assign themselves different roles in your application to get the full experience. While licensing is expressed at a package level currently, it is very likely that more granular licensing around the modules or features in your package will be provided by Salesforce in the future. This will likely be driven by the Permission Sets feature. As such, keep in mind a functional orientation to your Permission Set design. The Manage Licenses link is shown on the Installed Packages page next to your package if you configure a number of seats against the license. The administrator in the subscriber org can use this page to assign applicable users to your package. The following screenshot shows how your installed package looks to the administrator when the package has licensing enabled: Note that you do not need to keep reapplying the license requirements for each version you upload; the last details you defined will be carried forward to new versions of your package until you change them. Either way, these details can also be completely overridden on the License page of the LMA application as well. You may want to apply a site-wide (org-wide) active license to extensions or add-on packages. This allows you to at least track who has installed such packages even though you don't intend to manage any licenses around them, since you are addressing licensing on the main package. The Licenses tab and managing customer licenses The Licenses tab provides a list of individual license records that are automatically generated when the users install your package into their orgs. Salesforce captures this action and creates the relevant details, including Lead information, and also contains contact details of the organization and person who performed the install, as shown in the following screenshot: From each of these records, you can modify the current license details to extend the expiry period or disable the application completely. If you do this, the package will remain installed with all of its data. However, none of the users will be able to access the objects, Apex code, or pages, not even the administrator. You can also re-enable the license at any time. The following screenshot shows the License Edit section: The Subscribers tab The Subscribers tab lists all your customers or subscribers (it shows their Organization Name from the company profile) that have your packages installed (only those linked via AppExchange). This includes their organization ID, edition (Developer, Enterprise, or others), and also the type of instance (sandbox or production). The Subscriber Overview page When you click on Organization Name from the list in this tab, you are taken to the Subscriber Overview page. 
This page is sometimes known as the Partner Black Tab. It is packed with useful information, such as the contact details (also seen via the Leads tab) and the login access that may have been granted (we will discuss this in more detail in the next section), as well as which of your packages they have installed, their current license status, and when they were installed. The following is a screenshot of the Subscriber Overview page:

How licensing is enforced in the subscriber org

Licensing is enforced in one of two ways, depending on the execution context from which your packaged Custom Objects, fields, and Apex code are accessed.

The first context is where a user is interacting directly with your objects, fields, tabs, and pages via the user interface or via the Salesforce APIs (Partner and Enterprise). If the user or the organization is not licensed for your package, these will simply be hidden from view, and in the case of the API, return an error. Note that administrators can still see packaged components under the Setup menu.

The second context is access made from Apex code, such as an Apex trigger or controller, written by the customers themselves or contained within another package. This indirect way of accessing your package components is permitted if the license is site (org) wide or there is at least one user in the organization that is allocated a seat. This means that even if the current user has not been assigned a seat (via the Manage Licenses link), they can still access your application's objects and code indirectly, for example, via a customer-specific utility page or Apex trigger that automates the creation of some records or the defaulting of fields in your package.

Your application's Apex triggers (for example, the ones you might add to Standard Objects) will always execute, even if the user does not have a seat license, as long as at least one user seat license is assigned to your package in the subscriber org. However, if that license expires, the Apex trigger will no longer be executed by the platform until the license expiry is extended.

Providing support

Once your package has completed the security review, additional functionality for supporting your customers is enabled. Specifically, this includes the ability to log in securely (without exchanging passwords) to their environments and debug your application. When logged in this way, you can see everything the user sees, in addition to extended Debug Logs that contain the same level of detail as they would in a developer org.

First, your customer enables access via the Grant Account Login page. This time, however, your organization (note that this is the Company Name as defined in the packaging org under Company Profile) will be listed as one of those available, in addition to Salesforce Support. The following screenshot shows the Grant Account Login page:

Next, you log in to your LMO and navigate to the Subscribers tab as described. Open Subscriber Overview for the customer, and you should now see the link to log in as that user. From this point on, you can follow the steps given to you by your customer and utilize the standard Debug Log and Developer Console tools to capture the debug information you need. The following screenshot shows a user who has been granted login access via your package to their org:

This mode of access also permits you to see protected custom settings if you have included any of those in your package; a minimal sketch of reading one from Apex follows.
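For example, a protected custom setting can gate extra diagnostics in your package. The following is a minimal sketch, assuming a hypothetical protected hierarchy custom setting named DebugSettings__c with a checkbox field EnableVerboseLogging__c (neither is part of the article's sample application):

public with sharing class DiagnosticsService {
    // Returns true when verbose logging has been switched on via the
    // protected custom setting (visible to you during subscriber support)
    public static Boolean verboseLoggingEnabled() {
        DebugSettings__c settings = DebugSettings__c.getOrgDefaults();
        return settings != null && settings.EnableVerboseLogging__c == true;
    }

    public static void log(String message) {
        if (verboseLoggingEnabled()) {
            System.debug(LoggingLevel.INFO, '[MyPackage] ' + message);
        }
    }
}

Because the setting is protected, subscribers do not see it, but you can see it while logged in via granted login access and use it to enable richer diagnostics.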
If you have not encountered these before, it's well worth researching them as they provide an ideal way to enable and disable debug, diagnostic, or advanced configurations that you don't want your customers to normally see. Customer metrics Salesforce has started to expose information relating to the usage of your package components in the subscriber orgs since the Spring '14 release of the platform. This enables you to report what Custom Objects and Visualforce pages your customers are using and more importantly those they are not. This information is provided by Salesforce and cannot be opted out by the customer. At the time of writing, this facility is in pilot and needs to be enabled by Salesforce Support. Once enabled, the MetricsDataFile object is available in your production org and will receive a data file periodically that contains the metrics records. The Usage Metrics Visualization application can be found by searching on AppExchange and can help with visualizing this information. Trialforce and Test Drive Large enterprise applications often require some consultation with customers to tune and customize to their needs after the initial package installation. If you wish to provide trial versions of your application, Salesforce provides a means to take snapshots of the results of this installation and setup process, including sample data. You can then allow prospects that visit your AppExchange listing or your website to sign up to receive a personalized instance of a Salesforce org based on the snapshot you made. The potential customers can then use this to fully explore the application for a limited duration until they sign up to be a paid customer from the trial version. Such orgs will eventually expire when the Salesforce trial period ends for the org created (typically 14 days). Thus, you should keep this in mind when setting the default expiry on your package licensing. The standard approach is to offer a web form for the prospect to complete in order to obtain the trial. Review the Providing a Free Trial on your Website and Providing a Free Trial on AppExchange sections of the ISVForce Guide for more on this. You can also consider utilizing the Signup Request API, which gives you more control over how the process is started and the ability to monitor it, such that you can create the lead records yourself. You can find out more about this in the Creating Signups using the API section in the ISVForce Guide. Alternatively, if the prospect wishes to try your package in their sandbox environment for example, you can permit them to install the package directly either from AppExchange or from your website. In this case, ensure that you have defined a default expiry on your package license as described earlier. In this scenario, you or the prospect will have to perform the setup steps after installation. Finally, there is a third option called Test Drive, which does not create a new org for the prospect on request, but does require you to set up an org with your application, preconfigure it, and then link it to your listing via AppExchange. Instead of the users completing a signup page, they click on the Test Drive button on your AppExchange listing. This logs them into your test drive org as a read-only user. Because this is a shared org, the user experience and features you can offer to users is limited to those that mainly read information. I recommend that you consider Trialforce over this option unless there is some really compelling reason to use it. 
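Returning to the Signup Request API mentioned above, the following is a hedged sketch of creating a trial signup from Apex. The field names used here (Company, LastName, SignupEmail, Username, TemplateId) are assumptions based on the SignupRequest object; verify them against the current ISVforce Guide before relying on them:

// Create a signup request against a hypothetical Trialforce template ID
SignupRequest req = new SignupRequest();
req.Company     = 'Acme Corp';
req.LastName    = 'Prospect';
req.SignupEmail = 'prospect@example.com';
req.Username    = 'prospect@acme-trial.example.com';
req.TemplateId  = '0TTxx0000000001';  // hypothetical template ID
insert req;

// Later, poll the record to monitor provisioning and create your own lead
SignupRequest status = [SELECT Status, CreatedOrgId, ErrorCode
                        FROM SignupRequest
                        WHERE Id = :req.Id];
System.debug(LoggingLevel.INFO, status.Status + ' / ' + status.CreatedOrgId);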
When defining your listing in AppExchange, the Leads tab can be used to configure the creation of lead records for trials, test drives, and other activities on your listing. Enabling this will result in a form being presented to the user before accessing these features on your listing. If you provide access to trials through signup forms on your website for example, lead information will not be captured. Summary This article has given you a practical overview of the initial package creation process through installing it into another Salesforce organization. While some of the features discussed cannot be fully exercised until you're close to your first release phase, you can now head to development with a good understanding of how early decisions such as references to Standard Objects are critical to your licensing and cost decisions. It is also important to keep in mind that while tools such as Trialforce help automate the setup, this does not apply to installing and configuring your customer environments. Thus, when making choices regarding configurations and defaults in your design, keep in mind the costs to the customer during the implementation cycle. Make sure you plan for the security review process in your release cycle (the free online version has a limited bandwidth) and ideally integrate it into your CI build system (a paid facility) as early as possible, since the tool not only monitors security flaws but also helps report breaches in best practices such as lack of test asserts and SOQL or DML statements in loops. As you revisit the tools covered in this article, be sure to reference the excellent ISVForce Guide at http://www.salesforce.com/us/developer/docs/packagingGuide/index.htm for the latest detailed steps and instructions on how to access, configure, and use these features. Resources for Article: Further resources on this subject: Salesforce CRM Functions [Article] Force.com: Data Management [Article] Configuration in Salesforce CRM [Article]

HDFS and MapReduce

Packt
17 Jul 2014
11 min read
(For more resources related to this topic, see here.)

Essentials of HDFS

HDFS is a distributed filesystem that has been designed to run on top of a cluster of industry-standard hardware. The architecture of HDFS is such that there is no specific need for high-end hardware. HDFS is a highly fault-tolerant system and can handle failures of nodes in a cluster without loss of data. The primary goal behind the design of HDFS is to serve large data files efficiently. HDFS achieves this efficiency and high throughput in data transfer by enabling streaming access to the data in the filesystem.

The following are the important features of HDFS:

Fault tolerance: Many computers working together as a cluster are bound to have hardware failures. Hardware failures such as disk failures, network connectivity issues, and RAM failures could disrupt processing and cause major downtime. This could lead to data loss as well as slippage of critical SLAs. HDFS is designed to withstand such hardware failures by detecting faults and taking recovery actions as required. The data in HDFS is split across the machines in the cluster as chunks of data called blocks. These blocks are replicated across multiple machines of the cluster for redundancy. So, even if a node/machine becomes completely unusable and shuts down, the processing can go on with the copy of the data present on the nodes where the data was replicated.

Streaming data: Streaming access enables data to be transferred in the form of a steady and continuous stream. This means that if data from a file in HDFS needs to be processed, HDFS starts sending the data as it reads the file and does not wait for the entire file to be read. The client consuming this data starts processing it immediately, as it receives the stream from HDFS. This makes data processing really fast.

Large data store: HDFS is used to store large volumes of data. HDFS functions best when the individual data files stored are large files, rather than a large number of small files. File sizes in most Hadoop clusters range from gigabytes to terabytes. The storage scales linearly as more nodes are added to the cluster.

Portable: HDFS is a highly portable system. Since it is built on Java, any machine or operating system that can run Java should be able to run HDFS. Even at the hardware layer, HDFS is flexible and runs on most of the commonly available hardware platforms. Most production-level clusters have been set up on commodity hardware.

Easy interface: The HDFS command-line interface is very similar to any Linux/Unix system, and the commands are similar in most cases. So, if one is comfortable with performing basic file actions in Linux/Unix, using commands with HDFS should be very easy.

The following two daemons are responsible for operations on HDFS:

Namenode
Datanode

The namenode and datanode daemons talk to each other via TCP/IP.

Configuring HDFS

All HDFS-related configuration is done by adding/updating the properties in the hdfs-site.xml file that is found in the conf folder under the Hadoop installation folder. The following are the different properties that are part of the hdfs-site.xml file:

dfs.namenode.servicerpc-address: This specifies the unique namenode RPC address in the cluster. Services/daemons such as the secondary namenode and datanode daemons use this address to connect to the namenode daemon whenever they need to communicate.
This property is shown in the following code snippet: <property> <name>dfs.namenode.servicerpc-address</name> <value>node1.hcluster:8022</value> </property> dfs.namenode.http-address: This specifies the URL that can be used to monitor the namenode daemon from a browser. This property is shown in the following code snippet: <property> <name>dfs.namenode.http-address</name> <value>node1.hcluster:50070</value> </property> dfs.replication: This specifies the replication factor for data block replication across the datanode daemons. The default is 3 as shown in the following code snippet: <property> <name>dfs.replication</name> <value>3</value> </property> dfs.blocksize: This specifies the block size. In the following example, the size is specified in bytes (134,217,728 bytes is 128 MB): <property> <name>dfs.blocksize</name> <value>134217728</value> </property> fs.permissions.umask-mode: This specifies the umask value that will be used when creating files and directories in HDFS. This property is shown in the following code snippet: <property> <name>fs.permissions.umask-mode</name> <value>022</value> </property> The read/write operational flow in HDFS To get a better understanding of HDFS, we need to understand the flow of operations for the following two scenarios: A file is written to HDFS A file is read from HDFS HDFS uses a single-write, multiple-read model, where the files are written once and read several times. The data cannot be altered once written. However, data can be appended to the file by reopening it. All files in the HDFS are saved as data blocks. Writing files in HDFS The following sequence of steps occur when a client tries to write a file to HDFS: The client informs the namenode daemon that it wants to write a file. The namenode daemon checks to see whether the file already exists. If it exists, an appropriate message is sent back to the client. If it does not exist, the namenode daemon makes a metadata entry for the new file. The file to be written is split into data packets at the client end and a data queue is built. The packets in the queue are then streamed to the datanodes in the cluster. The list of datanodes is given by the namenode daemon, which is prepared based on the data replication factor configured. A pipeline is built to perform the writes to all datanodes provided by the namenode daemon. The first packet from the data queue is then transferred to the first datanode daemon. The block is stored on the first datanode daemon and is then copied to the next datanode daemon in the pipeline. This process goes on till the packet is written to the last datanode daemon in the pipeline. The sequence is repeated for all the packets in the data queue. For every packet written on the datanode daemon, a corresponding acknowledgement is sent back to the client. If a packet fails to write onto one of the datanodes, the datanode daemon is removed from the pipeline and the remainder of the packets is written to the good datanodes. The namenode daemon notices the under-replication of the block and arranges for another datanode daemon where the block could be replicated. After all the packets are written, the client performs a close action, indicating that the packets in the data queue have been completely transferred. The client informs the namenode daemon that the write operation is now complete. 
The following diagram shows the data block replication process across the datanodes during a write operation in HDFS: Reading files in HDFS The following steps occur when a client tries to read a file in HDFS: The client contacts the namenode daemon to get the location of the data blocks of the file it wants to read. The namenode daemon returns the list of addresses of the datanodes for the data blocks. For any read operation, HDFS tries to return the node with the data block that is closest to the client. Here, closest refers to network proximity between the datanode daemon and the client. Once the client has the list, it connects the closest datanode daemon and starts reading the data block using a stream. After the block is read completely, the connection to datanode is terminated and the datanode daemon that hosts the next block in the sequence is identified and the data block is streamed. This goes on until the last data block for that file is read. The following diagram shows the read operation of a file in HDFS: Understanding the namenode UI Hadoop provides web interfaces for each of its services. The namenode UI or the namenode web interface is used to monitor the status of the namenode and can be accessed using the following URL: http://<namenode-server>:50070/ The namenode UI has the following sections: Overview: The general information section provides basic information of the namenode with options to browse the filesystem and the namenode logs. The following is the screenshot of the Overview section of the namenode UI: The Cluster ID parameter displays the identification number of the cluster. This number is same across all the nodes within the cluster. A block pool is a set of blocks that belong to a single namespace. The Block Pool Id parameter is used to segregate the block pools in case there are multiple namespaces configured when using HDFS federation. In HDFS federation, multiple namenodes are configured to scale the name service horizontally. These namenodes are configured to share datanodes amongst themselves. We will be exploring HDFS federation in detail a bit later. Summary: The following is the screenshot of the cluster's summary section from the namenode web interface: Under the Summary section, the first parameter is related to the security configuration of the cluster. If Kerberos (the authorization and authentication system used in Hadoop) is configured, the parameter will show as Security is on. If Kerberos is not configured, the parameter will show as Security is off. The next parameter displays information related to files and blocks in the cluster. Along with this information, the heap and non-heap memory utilization is also displayed. The other parameters displayed in the Summary section are as follows: Configured Capacity: This displays the total capacity (storage space) of HDFS. DFS Used: This displays the total space used in HDFS. Non DFS Used: This displays the amount of space used by other files that are not part of HDFS. This is the space used by the operating system and other files. DFS Remaining: This displays the total space remaining in HDFS. DFS Used%: This displays the total HDFS space utilization shown as percentage. DFS Remaining%: This displays the total HDFS space remaining shown as percentage. Block Pool Used: This displays the total space utilized by the current namespace. Block Pool Used%: This displays the total space utilized by the current namespace shown as percentage. 
As you can see in the preceding screenshot, in this case, the value matches that of the DFS Used% parameter. This is because there is only one namespace (one namenode) and HDFS is not federated. DataNodes usages% (Min, Median, Max, stdDev): This displays the usages across all datanodes in the cluster. This helps administrators identify unbalanced nodes, which may occur when data is not uniformly placed across the datanodes. Administrators have the option to rebalance the datanodes using a balancer. Live Nodes: This link displays all the datanodes in the cluster as shown in the following screenshot: Dead Nodes: This link displays all the datanodes that are currently in a dead state in the cluster. A dead state for a datanode daemon is when the datanode daemon is not running or has not sent a heartbeat message to the namenode daemon. Datanodes are unable to send heartbeats if there exists a network connection issue between the machines that host the datanode and namenode daemons. Excessive swapping on the datanode machine causes the machine to become unresponsive, which also prevents the datanode daemon from sending heartbeats. Decommissioning Nodes: This link lists all the datanodes that are being decommissioned. Number of Under-Replicated Blocks: This represents the number of blocks that have not replicated as per the replication factor configured in the hdfs-site.xml file. Namenode Journal Status: The journal status provides location information of the fsimage file and the state of the edits logfile. The following screenshot shows the NameNode Journal Status section: NameNode Storage: The namenode storage table provides the location of the fsimage file along with the type of the location. In this case, it is IMAGE_AND_EDITS, which means the same location is used to store the fsimage file as well as the edits logfile. The other types of locations are IMAGE, which stores only the fsimage file and EDITS, which stores only the edits logfile. The following screenshot shows the NameNode Storage information:
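To round out the Easy interface point made earlier, the following shell commands illustrate typical day-to-day interaction with HDFS (a sketch; the paths and file names are only examples):

# Create a directory in HDFS and copy a local file into it
hdfs dfs -mkdir -p /user/analyst/sales
hdfs dfs -put sales-2014.csv /user/analyst/sales/

# List the directory and peek at the file contents
hdfs dfs -ls /user/analyst/sales
hdfs dfs -cat /user/analyst/sales/sales-2014.csv | head

# Copy the file back to the local filesystem
hdfs dfs -get /user/analyst/sales/sales-2014.csv ./backup/

# Cluster-wide capacity and datanode summary, similar to the namenode UI
hdfs dfsadmin -report

The replication factor and block size reported for such files follow the dfs.replication and dfs.blocksize values configured in hdfs-site.xml.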

Adding a Geolocation Trigger to the Salesforce Account Object

Packt
16 May 2014
8 min read
(For more resources related to this topic, see here.) Obtaining the Google API key First, you need to obtain an API key for the Google Geocoding API: Visit https://code.google.com/apis/console and sign in with your Google account (assuming you already have one). Click on the Create Project button. Enter My Salesforce Account Project for the Project name. Accept the default value for the Project ID. Click on Create. Click on APIs & auth from the left-hand navigation bar. Set the Geocoding API to ON. Select Credentials and click on CREATE NEW KEY. Click on the Browser Key button. Click on Create to generate the key. Make a note of the API key. Adding a Salesforce remote site Now, we need to add a Salesforce remote site for the Google Maps API: Navigate to Setup | Security Controls | Remote Site Settings. Click on the New Remote Site button. Enter Google_Maps_API for the Remote Site Name. Enter https://maps.googleapis.com for the Remote Site URL. Ensure that the Active checkbox is checked. Click on Save. Your remote site detail should resemble the following screenshot: Adding the Location custom field to Account Next, we need to add a Location field to the Account object: Navigate to Setup | Customize | Accounts | Fields. Click on the New button in the Custom Fields & Relationships section. Select Geolocation for the Data Type. Click on Next. Enter Location for the Field Label. The Field Name should also default to Location. Select Decimal for the Latitude and Longitude Display Notation. Enter 7 for the Decimal Places. Click on Next. Click on Next to accept the defaults for Field-Level Security. Click on Save to add the field to all account related page layouts. Adding the Apex Utility Class Next, we need an Apex utility class to geocode an address using the Google Geocoding API: Navigate to Setup | Develop | Apex Classes. All of the Apex classes for your organization will be displayed. Click on Developer Console. Navigate to File | New | Apex Class. Enter AccountGeocodeAddress for the Class Name and click on OK. Enter the following code into the Apex Code Editor in your Developer Console window: // static variable to determine if geocoding has already occurred private static Boolean geocodingCalled = false; // wrapper method to prevent calling future methods from an existing future context public static void DoAddressGeocode(id accountId) {   if (geocodingCalled || System.isFuture()) {     System.debug(LoggingLevel.WARN, '***Address Geocoding Future Method Already Called - Aborting...');     return;   }   // if not being called from future context, geocode the address   geocodingCalled = true;   geocodeAddress(accountId); } The AccountGeocodeAddress method and public static variable geocodingCalled protect us from a potential error where a future method may be called from within a future method that is already executing. If this isn't the case, we call the geocodeAddress method that is defined next. 
Enter the following code into the Apex Code Editor in your Developer Console window: // we need a future method to call Google Geocoding API from Salesforce @future (callout=true) static private void geocodeAddress(id accountId) {   // Key for Google Maps Geocoding API   String geocodingKey = '[Your API Key here]';   // get the passed in address   Account geoAccount = [SELECT BillingStreet, BillingCity, BillingState, BillingCountry, BillingPostalCode     FROM Account     WHERE id = :accountId];       // check that we have enough information to geocode the address   if ((geoAccount.BillingStreet == null) || (geoAccount.BillingCity == null)) {     System.debug(LoggingLevel.WARN, 'Insufficient Data to Geocode Address');     return;   }   // create a string for the address to pass to Google Geocoding API   String geoAddress = '';   if (geoAccount.BillingStreet != null)     geoAddress += geoAccount.BillingStreet + ', ';   if (geoAccount.BillingCity != null)     geoAddress += geoAccount.BillingCity + ', ';   if (geoAccount.BillingState != null)     geoAddress += geoAccount.BillingState + ', ';   if (geoAccount.BillingCountry != null)     geoAddress += geoAccount.BillingCountry + ', ';   if (geoAccount.BillingPostalCode != null)     geoAddress += geoAccount.BillingPostalCode;     // encode the string so we can pass it as part of URL   geoAddress = EncodingUtil.urlEncode(geoAddress, 'UTF-8');   // build and make the callout to the Geocoding API   Http http = new Http();   HttpRequest request = new HttpRequest();   request.setEndpoint('https://maps.googleapis.com/maps/api/geocode/json?address='     + geoAddress + '&key=' + geocodingKey     + '&sensor=false');   request.setMethod('GET');   request.setTimeout(60000);   try {     // make the http callout     HttpResponse response = http.send(request);     // parse JSON to extract co-ordinates     JSONParser responseParser = JSON.createParser(response.getBody());     // initialize co-ordinates     double latitude = null;     double longitude = null;     while (responseParser.nextToken() != null) {       if ((responseParser.getCurrentToken() == JSONToken.FIELD_NAME) &&       (responseParser.getText() == 'location')) {         responseParser.nextToken();         while (responseParser.nextToken() != JSONToken.END_OBJECT) {           String locationText = responseParser.getText();           responseParser.nextToken();           if (locationText == 'lat')             latitude = responseParser.getDoubleValue();           else if (locationText == 'lng')             longitude = responseParser.getDoubleValue();         }       }     }     // update co-ordinates on address if we get them back     if (latitude != null) {       geoAccount.Location__Latitude__s = latitude;       geoAccount.Location__Longitude__s = longitude;       update geoAccount;     }   } catch (Exception e) {     System.debug(LoggingLevel.ERROR, 'Error Geocoding Address - ' + e.getMessage());   } } Insert your Google API key in the following line of code: String geocodingKey = '[Your API Key here]'; Navigate to File | Save. Adding the Apex Trigger Finally, we need to implement an Apex trigger class to geocode the Billing Address when an Account is added or updated Navigate to Setup | Develop | Apex Triggers. All of the Apex triggers for your organization will be displayed. Click on Developer Console. Navigate to File | New | Apex Trigger in the Developer Console. Enter geocodeAccountAddress in the Name field. Select Account in the Objects dropdown list and click on Submit. 
Enter the following code into the Apex Code Editor in your Developer Console window:

trigger geocodeAccountAddress on Account (after insert, after update) {

  // bulkify trigger in case of multiple accounts
  for (Account account : trigger.new) {

    // check if the Billing Address has been updated
    Boolean addressChangedFlag = false;
    if (Trigger.isUpdate) {
      Account oldAccount = Trigger.oldMap.get(account.Id);
      // compare each component of the billing address against its previous value
      if ((account.BillingStreet != oldAccount.BillingStreet) ||
          (account.BillingCity != oldAccount.BillingCity) ||
          (account.BillingState != oldAccount.BillingState) ||
          (account.BillingCountry != oldAccount.BillingCountry) ||
          (account.BillingPostalCode != oldAccount.BillingPostalCode)) {
        addressChangedFlag = true;
        System.debug(LoggingLevel.DEBUG, '***Address changed for - ' + oldAccount.Name);
      }
    }

    // if the location is empty or the address has changed, geocode it
    if ((account.Location__Latitude__s == null) || (addressChangedFlag == true)) {
      System.debug(LoggingLevel.DEBUG, '***Geocoding Account - ' + account.Name);
      AccountGeocodeAddress.DoAddressGeocode(account.id);
    }
  }
}

Navigate to File | Save. The after insert / after update account trigger itself is relatively simple. If the Location field is blank, or the Billing Address has been updated, a call is made to the AccountGeocodeAddress.DoAddressGeocode method to geocode the address against the Google Maps Geocoding API.

Summary

Congratulations, you have now completed the Geolocation trigger for your Salesforce Account object. With this, we can calculate distances between two objects in Salesforce or search for accounts/contacts within a certain radius (a hedged SOQL sketch of such a radius query follows the resource list below).

Resources for Article: Further resources on this subject: Learning to Fly with Force.com [Article] Salesforce CRM Functions [Article] Force.com: Data Management [Article]
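As a follow-up to the radius search mentioned in the summary, the following is a hedged SOQL sketch using the DISTANCE and GEOLOCATION functions against the Location field added earlier (the coordinates and the 20-mile radius are example values; prefix the field with your namespace if it ships in a managed package):

// Accounts within roughly 20 miles of San Francisco, nearest first
List<Account> nearby = [
    SELECT Name, BillingCity
    FROM Account
    WHERE DISTANCE(Location__c, GEOLOCATION(37.7749, -122.4194), 'mi') < 20
    ORDER BY DISTANCE(Location__c, GEOLOCATION(37.7749, -122.4194), 'mi')
    LIMIT 25
];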
Using OpenStack Swift

Packt
13 May 2014
4 min read
(For more resources related to this topic, see here.)

Installing the clients

This section talks about installing the client tools.

cURL: This is a command-line tool that can be used to transfer data using various protocols. We install cURL using the following command:

$ apt-get install curl

OpenStack Swift Client CLI: This tool is installed using the following command:

$ apt-get install python-swiftclient

REST API Client: To access OpenStack Swift services via the REST API, we can use third-party tools such as the Fiddler web debugger, which supports the REST architecture.

Creating a token using authentication

The first step in order to access containers or objects is to authenticate the user by sending a request to the authentication service and getting a valid token that can then be used in subsequent commands to perform various operations, as follows:

curl -X POST -i https://auth.lts2.evault.com/v2.0/Tokens -H 'Content-type: application/json' -d '{"auth":{"passwordCredentials":{"username":"user","password":"password"},"tenantName":"admin"}}'

The token that is generated is given below. It has been truncated for better readability.

token = MIIGIwYJKoZIhvcNAQcCoIIGFDCCBhACAQExCTAHBgUrDgMCGjC CBHkGCSqGSIb3DQEHAaCCBGoEggRme…yJhY2Nlc3MiOiB7InRva2VuIjoge yJpc3N1ZWRfYXQiOiAiMjAxMy0xMS0yNlQwNjoxODo0Mi4zNTA0NTciLCU+ KNYN20G7KJO05bXbbpSAWw+5Vfl8zl6JqAKKWENTrlKBvsFzO-peLBwcKZX TpfJkJxqK7Vpzc-NIygSwPWjODs--0WTes+CyoRD

EVault LTS2 authentication

The EVault LTS2 OpenStack Swift cluster provides its own private authentication service, which returns the token. This generated token will be passed as the token parameter in subsequent commands.

Displaying metadata information for an account, container, or object

This section describes how we can obtain information about the account, container, or object.

Using the OpenStack Swift Client CLI

The OpenStack Swift client CLI stat command is used to get information about the account, container, or object. The name of the container should be provided after the stat command to get container information; the name of the container and object should be provided to get object information. Make the following request to display the account status:

# swift --os-auth-token=token --os-storage-url=https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b stat

Here, token is the generated token as described in the previous section, and 26cef4782cca4e5aabbb9497b8c1ee1b is the account name. The response shows the information about the account:

Account: 26cef4782cca4e5aabbb9497b8c1ee1b
Containers: 2
Objects: 6
Bytes: 17
Accept-Ranges: bytes
Server: nginx/1.4.1

Using cURL

The following shows how to obtain the same information using cURL. The response headers show that the account contains 2 containers and 6 objects. Make the following request:

curl -X HEAD -i https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b -H 'X-Auth-Token: token' -H 'Content-type: application/json'

The response is as follows:

HTTP/1.1 204 No Content
Server: nginx/1.4.1
Date: Wed, 04 Dec 2013 06:53:13 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
X-Account-Bytes-Used: 3439364822
X-Account-Container-Count: 2
X-Account-Object-Count: 6

Using the REST API

The same information can be obtained using the following REST API method.
Make the following request:

Method: HEAD
URL: https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b
Header: X-Auth-Token: token
Data: No data

The response is as follows:

HTTP/1.1 204 No Content
Server: nginx/1.4.1
Date: Wed, 04 Dec 2013 06:47:17 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
X-Account-Bytes-Used: 3439364822
X-Account-Container-Count: 2
X-Account-Object-Count: 6

Listing Containers

This section describes how to obtain information about the containers present in an account.

Using the OpenStack Swift Client CLI

Make the following request:

swift --os-auth-token=token --os-storage-url=https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b list

The response is as follows:

cities
countries

Using cURL

The following shows how to obtain the same information using cURL. The response lists the container names, and its headers show that the account contains 2 containers and 6 objects. Make the following request:

curl -X GET -i https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b -H 'X-Auth-Token: token'

The response is as follows:

HTTP/1.1 200 OK
X-Account-Container-Count: 2
X-Account-Object-Count: 6
cities
countries

Using the REST API

Make the following request:

Method: GET
URL: https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b
Header: X-Auth-Token: token
Data: No data

The response is as follows:

X-Account-Container-Count: 2
X-Account-Object-Count: 6
cities
countries

Summary

This article has explained the various mechanisms that are available to access OpenStack Swift and how, by using these mechanisms, we are able to authenticate against an account and list its containers (a short cURL sketch of uploading and downloading objects follows the resource list below).

Resources for Article: Further resources on this subject: Securing vCloud Using the vCloud Networking and Security App Firewall [Article] Introduction to Cloud Computing with Microsoft Azure [Article] Troubleshooting in OpenStack Cloud Computing [Article]
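As referenced in the summary, the same token and storage URL can be used to work with objects themselves. The following cURL sketch uploads, downloads, and deletes an object in the cities container (the object name london.txt is only an example):

# Upload a local file as an object into the cities container
curl -X PUT -i https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b/cities/london.txt -H 'X-Auth-Token: token' -T london.txt

# Download the object to a local file
curl -X GET -o london.txt https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b/cities/london.txt -H 'X-Auth-Token: token'

# Delete the object
curl -X DELETE -i https://storage.lts2.evault.com/v1/26cef4782cca4e5aabbb9497b8c1ee1b/cities/london.txt -H 'X-Auth-Token: token'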

Building Mobile Apps

Packt
21 Apr 2014
6 min read
(For more resources related to this topic, see here.) As mobile apps get closer to becoming the de-facto channel to do business on the move, more and more software vendors are providing easy to use mobile app development platforms for developers to build powerful HTML5, CSS, and JavaScript based apps. Most of these mobile app development platforms provide the ability to build native, web, and hybrid apps. There are several very feature rich and popular mobile app development toolkits available in the market today. Some of them worth mentioning are: Appery (http://appery.io) AppBuilder (http://www.telerik.com/appbuilder) Phonegap (http://phonegap.com/) Appmachine (http://www.appmachine.com/) AppMakr (http://www.appmakr.com/) (AppMakr is currently not starting new apps on their existing legacy platform. Any customers with existing apps still have full access to the editor and their apps.) Codiqa (https://codiqa.com) Conduit (http://mobile.conduit.com/) Apache Cordova (http://cordova.apache.org/) And there are more. The list is only a partial list of the amazing tools currently in the market for building and deploying mobile apps quickly. The Heroku platform integrates with the Appery.io (http://appery.io) mobile app development platform to provide a seamless app development experience. With the Appery.io mobile app development platform, the process of developing a mobile app is very straightforward. You build the user interface (UI) of your app using drag-and-drop from an available palette. The palette contains a rich set of user interface controls. Create the navigation flow between the different screens of the app, and link the actions to be taken when certain events such as clicking a button. Voila! You are done. You save the app and test it there using the Test menu option. Once you are done with testing the app, you can host the app using Appery's own hosting service or the Heroku hosting service. Mobile app development was never this easy. Introducing Appery.io Appery.io (http://www.appery.io) is a cloud-based mobile app development platform. With Appery.io, you can build powerful mobile apps leveraging the easy to use drag-and-drop tools combined with the ability to use client side JavaScript to provide custom functionality. Appery.io enables you to create real world mobile apps using built-in support for backend data stores, push notifications, server code besides plugins to consume third-party REST APIs and help you integrate with a plethora of external services. Appery.io is an innovative and intuitive way to develop mobile apps for any device, be it iOS, Windows or Android. Appery.io takes enterprise level data integration to the next level by exposing your enterprise data to mobile apps in a secure and straightforward way. It uses Exadel's (Appery.io's parent company) RESTExpress API to enable sharing your enterprise data with mobile apps through a REST interface. Appery.io's mobile UI development framework provides a rich toolkit to design the UI using many types of visual components required for the mobile apps including google maps, Vimeo and Youtube integration. You can build really powerful mobile apps and deploy them effortlessly using drag and drop functionality inside the Appery.io app builder. What is of particular interest to Heroku developers is Appery.io's integration with mobile backend services with option to deploy on the Heroku platform with the click of a button. 
This is a powerful feature wherein you do not need to install any software on your local machine and can build and deploy real-world mobile apps on cloud-based platforms such as Appery.io and Heroku. In this section, we create a simple mobile app and deploy it on Heroku. In doing so, we will also learn:

How to create a mobile UI form
How to configure your backend services (REST or database)
How to integrate your UI with backend services
How to deploy the app to Heroku
How to test your mobile app

We will also review the salient features of the Appery.io mobile app development platform and focus on the ease of development of these apps and how one can easily leverage web services to deploy apps and consume data from any system.

Getting Appery.io

The cool thing about Appery.io is that it is a cloud-based mobile app development toolkit and can be accessed from any popular web browser. To get started, create an account at http://appery.io and you are all set. You can sign up for a basic starter version, which provides the ability to develop one app per account, and go all the way up to the paid Premium and Enterprise grade subscriptions.

Introducing the Appery.io app builder

The Appery.io app builder is a cloud-based mobile application development kit that can be used to build mobile apps for any platform. It consists of intuitive tooling and a rich controls palette to help developers drag and drop controls onto the device and design robust mobile apps. The Appery.io app builder has the following sections:

Device layout section: This section contains the mock layout of the device onto which the developer can drag and drop visual controls to create a user interface.
Palette: This contains a rich list of visual controls, such as buttons, text boxes, Google Map controls, and more, that developers can use to build the user experience.
Project explorer: This section consists of many things, including project files, application-level settings/configuration, available themes for the device, custom components, available CSS and JavaScript code, templates, pop-ups, and one of the key elements: the available backend services.
Key menu options: Save and Test for the app being designed.
Page properties: This section consists of the design-level properties for the page being designed. Modifying these properties changes the user interface labels or the underlying properties of the page elements.
Events: This is another very important section of the app builder that contains the event-to-action associations for the various elements of the page. For example, it can contain the action to be taken when a click event happens on a button on this page.

The following Appery.io app builder screenshot highlights the various sections of the rich toolkit available for mobile app developers to build apps quickly:

Creating your first Appery.io mobile app

Building a mobile app is quite straightforward using Appery.io's feature-rich app development platform. To get started, create an Appery.io account at http://appery.io and log in using valid credentials. Click on the Create new app link on the left section of your Appery.io account page. Enter the new app name, for example, herokumobile app, and click on Create. Enter the name of the first/launch page of your mobile app and click on Create page. This creates the new Appery.io app and points the user to the Appery.io app builder to design the start page of the new mobile app.