NOT Another GDPR Post – CTO Reviews Quarterly Performance
If you are a tech person like me, you’ll probably share the sentiment: soft stuff like writing a blog gets pushed into the darkest corner of the schedule in the hope that it will just go away. Well, it won’t, and that’s not fair to you, our customers and partners, so I promise to deliver quarterly reports on what we’ve been up to and what’s ahead, and perhaps share some juicy details.
What we’ve been up to
We ended 2016 heavily scarred by the infamous Black Friday incident, when we hit a snag with our storage subsystem in Frankfurt. Although no customer data was lost, it took a while to bring everything back to normal. Morale was low and customers were angry, but we pulled ourselves together and moved on.
Early in 2017, we contacted 42on B.V., a consultancy based in the Netherlands, and agreed to hold an intense two-day brainstorming session on two things: how to avoid meltdowns in the future and how to make the storage faster. We designed a completely new cluster, got rid of the underperforming Ceph caching layer, bought new storage boxes and went to the DC for installation.
The results were amazing compared to the previous design: we no longer had unpredictable I/O latency (we removed the tens of thousands of lines of code each I/O had to traverse in the caching layer logic), loads dropped significantly (more and faster SAS disks, better CPUs), IOPS were much higher thanks to NVMe backing and, in a way, power consumption was lower due to the change from BASE-T to fiber channel-based networking.
Without waiting, we performed the same upgrade in Chicago, and later the same year in Johannesburg.
Our network infrastructure had been growing organically over the past ten years, so it’s no wonder that we found ourselves surrounded by semi-random switches, routers, transceivers and cables. We would bump into inexplicable packet drops, latency spikes, protocol compatibility issues and so on. This did not sit well with our goal of automating router/switch configuration changes either, since developers would need to write yet another compatibility layer.
We redesigned the data center network topology and, drawing on past experience and knowing what we wanted, decided to standardize on Juniper hardware.
As of this writing, Frankfurt, Chicago and Johannesburg have all been upgraded to the new spec.
In addition to hardware upgrades, we’ve considerably expanded our connectivity with local Internet exchanges and Tier 1 network providers, which brings not only higher availability but also a better price per Mbps for the end user.
Data Center Upgrades
Looking back, it feels like something out of science fiction, but in between everything else we also built out two locations and moved all services in them: São Paulo, Brazil and Johannesburg, South Africa.
Both were "legacy" locations where only parts of the infrastructure were owned by us. Not having direct contracts with Tier 1 upstream providers and not being able to offer a full set of Cloud services hampered our growth.
With the upgrades complete, we have launched our standard suite of services, previously unavailable in both locations: Enterprise Cloud, Virtuozzo, IP Transit and Colocation. We also managed to cut end-user bandwidth costs and considerably improve compute and storage performance thanks to the hardware upgrade.
Having all that shiny metal is not much good without automation. Our in-house development team has been busy as ever. Here is a quick glimpse of their quarterly achievements.
Virtuozzo 7 Automation
For some time we were running an open-source cousin of Virtuozzo®, OpenVZ, in production. It was good, but just as our focus shifted towards business clients and partners, so did the requirements for the future of this particular virtualization platform.
The top requests from our customers were a newer kernel, faster backups, better OS support and a stable tun(nel) module, while our engineering team wanted more efficient compute utilization, automated datastore management and an improved management panel.
Our dev team designed and developed a production-ready automation system in the span of 8 months. The system is written in Go and we rely heavily on AMQP for global message queueing and processing.
This enabled us to provide not only a full set of services, like capacity monitoring, statistics collection, smart placement, storage and backup management, but also better management capabilities for our system admin team. Most importantly, though, we’ve exposed all of this functionality via a publicly available API.
Our flagship hypervisor-based VM product has been in production for two years now and has quickly become a de facto standard for long-term infrastructure deployments. The development of this platform is our strategic goal, so we have been constantly releasing improvements:
– Template Builder, keeping our operating system templates up to date with the latest security patches and improvements.
– CPU and memory hot-plugging, allowing our customers to upgrade these resources on supported OSes on the fly, without a system restart.
– Custom ISO upload, allowing you to install a custom operating system image of your choice for maximum flexibility, or simply install an application.
– Bandwidth consumption monitoring and alerting, which helps you keep track of this consumable resource.
– Lastly, we’ve updated the backend logic so that VMs are placed on dedicated NUMA nodes for improved VM performance and better resource utilization.
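To make the NUMA placement idea more concrete, here is a minimal sketch, not our production logic. It assumes a simple best-fit rule: a VM goes onto the single NUMA node that can hold all of its vCPUs and memory while leaving the least capacity unused. The names `NumaNode`, `pick_numa_node` and `place_vm` are made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NumaNode:
    """Free capacity of one NUMA node on a host (hypothetical model)."""
    node_id: int
    free_cpus: int
    free_mem_gb: int


def pick_numa_node(nodes: List[NumaNode], vcpus: int, mem_gb: int) -> Optional[NumaNode]:
    """Return the tightest-fitting node that can hold the whole VM, or None."""
    candidates = [n for n in nodes if n.free_cpus >= vcpus and n.free_mem_gb >= mem_gb]
    if not candidates:
        return None
    # Best fit (an assumption of this sketch): leave the fewest CPUs,
    # then the least memory, unused on the chosen node.
    return min(candidates, key=lambda n: (n.free_cpus - vcpus, n.free_mem_gb - mem_gb))


def place_vm(nodes: List[NumaNode], vcpus: int, mem_gb: int) -> Optional[int]:
    """Reserve capacity on one node and return its id, or None if nothing fits."""
    node = pick_numa_node(nodes, vcpus, mem_gb)
    if node is None:
        return None
    node.free_cpus -= vcpus
    node.free_mem_gb -= mem_gb
    return node.node_id
```

Keeping a VM inside one node means its vCPUs work against local memory instead of crossing the interconnect to another socket, which is where the performance gain comes from.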
One of the strategic goals of our development has been exposing as much of our infrastructure platform functionality as possible programmatically, via an API. Being a technical company at our core, we deal with APIs every hour of every day: the website checkout talks to the billing system, the billing system talks to the automation platform, automation talks to machines and network gear, the monitoring system talks to the chat system, and so on. All of this is possible thanks to a set of rules, definitions and protocols that allow developers to build applications or any kind of integration between otherwise unrelated systems.
The Host1Plus API allows anyone with an account with us to grab an API token and control the full lifecycle of an Enterprise Cloud or Virtuozzo environment. In addition, we have added reverse DNS control and IP management functionality, which means you can integrate virtual machine provisioning and management into your existing virtualized infrastructure workflow.
We have standardized on RESTful principles while developing our API endpoints. Please give it a try and let us know if you’d like to see it improved.
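As a rough illustration of what a token-authenticated REST call looks like from the client side, here is a minimal Python sketch that only builds the request and does not send it. Everything specific in it is an assumption made for the example: the base URL `https://api.example.com/v1`, the `/servers` path, the JSON payload and the `Bearer` authorization scheme are placeholders, so please check the API documentation for the real endpoints and header format.

```python
import json
import urllib.request

# Placeholder base URL for illustration only; not the real API address.
API_BASE = "https://api.example.com/v1"


def build_request(token, method, path, payload=None):
    """Build (but do not send) a JSON request carrying an API token."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(API_BASE + path, data=data, method=method)
    # The "Bearer" scheme is an assumption; the real API may expect another header.
    req.add_header("Authorization", "Bearer " + token)
    req.add_header("Accept", "application/json")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Hypothetical "create a server" call.
request = build_request("my-api-token", "POST", "/servers", {"hostname": "web-1"})
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; the same pattern with `GET`, `PUT` and `DELETE` covers the rest of a resource’s lifecycle in a RESTful design.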
WHMCS Partner Module
Having an API allowed us to take our Partner Program to the next level. We have released an open-source module for the popular WHMCS web hosting automation platform which, with a few configuration steps, allows you to enable Cloud and Virtuozzo services for your customers.
You can find it on GitHub. It comes with documentation, and our developers are available in case you need assistance or have a feature request. Currently it’s tailored for WHMCS7/PHP7.
We strive to improve the user experience of our services, which are accessible via the Client Area. Big things are coming there, but meanwhile:
– If you have Premium Dedicated servers with us, you have probably noticed that the server dashboard now has a similar look and feel to the rest of our Cloud services: an informative dashboard with specifications and an event log, power and console control buttons, and a statistics tab.
– The Virtuozzo dashboard has been improved to provide more details on resource utilization, and it now has a much more effective out-of-band management console, an improved event log and a fast scheduled backups section.
– Interaction with our support organization has been improved, and we now also show your account manager’s information.
– By popular request, we’ve added bulk reverse DNS management (with IPv6 support!).
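For readers wondering what reverse DNS involves once IPv6 enters the picture, the short sketch below uses Python’s standard `ipaddress` module to compute the DNS name under which each address’s PTR record lives. The helper name `ptr_names` and the sample addresses (taken from the documentation ranges) are chosen for this example and are unrelated to how the Client Area feature is implemented.

```python
import ipaddress


def ptr_names(addresses):
    """Map each IPv4/IPv6 address to the DNS name that holds its PTR record."""
    return {addr: ipaddress.ip_address(addr).reverse_pointer for addr in addresses}


names = ptr_names(["192.0.2.10", "2001:db8::1"])
for address, ptr in names.items():
    print(address, "->", ptr)
```

An IPv4 address maps to four reversed octets under `in-addr.arpa`, while an IPv6 address expands to 32 reversed hex digits under `ip6.arpa`, which is why editing these records in bulk saves so much typing.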
It is almost mid-year and we are well into a number of projects set to be delivered this year. Here is a quick glimpse of what is to come in the short to medium term.
– We have just launched Virtuozzo, improved our backup system and considerably improved networking in LAX (Los Angeles).
– Backups in SAO (São Paulo) are on their way. I know they’re overdue, but we believe we have now fixed the delivery and import process to Brazil, so things should move quickly from here. We will also re-enable the local Internet exchange in São Paulo for better connectivity and, hopefully, better Mbps pricing for you.
– We are in the midst of launching an improved traffic accounting system in all of our locations. We’ll be able to track per-upstream bandwidth consumption with much finer granularity, which will allow us to present bandwidth consumption information more transparently and let you choose which kind of bandwidth you need more of (e.g. more local traffic vs. expensive international traffic).
– Deploying hardware firewalls in major locations, which we’ll virtualize to provide hardware firewall as a service, configurable via Client Area.
– Removing all single points of failure from our infrastructure. It’s a daunting task, but we are committed to getting rid of them and are well into the process already.
– Service status page: we’ll launch a site (and perhaps provide an API endpoint) to check/query our service health, like Billing, API, virtualization, storage and other systems.
– Virtual Private Cloud: a much-awaited product, which will allow you to take control of all infrastructure aspects and get better inter-VM communication.
– Instant Bare-metal over API: we are about to launch an Instant Bare-metal product in multiple locations around the globe, which you’ll be able to order via API in a matter of minutes. Essentially, the simplicity of Cloud on shared-nothing boxes. Stay tuned for an announcement very soon.
– Premium Dedicated Servers management console proxy: we’ll make it easier to grab that out-of-band IPMI console, without the need to open an additional page.
– Virtual Private Cloud interface and management: although we’ll offer a Managed VPC with its own console, we want to integrate it with the rest of our products in the Client Area.
– BYO-IP process simplification: for the times when you want to bring your own IPs to be used on our infrastructure, we’ll simplify the process so there’s as little human interaction as possible.
– Host1Plus SDK: to help you get started quickly with our API, we’ll provide libraries in popular programming languages.
– Client Area rewrite: we are getting ready to switch from legacy APIs and start consuming our own API for the Client Area interface. This will allow us to enable 2-factor authentication, multiple user levels/roles and other long-awaited features. Design consistency and UX are also a top priority.
– Enterprise Cloud improvements: introducing storage tiers and system backups.
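To illustrate the per-upstream traffic accounting mentioned in the list above, here is a minimal sketch of the underlying aggregation. It assumes traffic arrives as `(timestamp, upstream, bytes)` samples and is summed into fixed time buckets. The sample format, the 5-minute default and the upstream labels are assumptions made for this example, not a description of the system we are rolling out.

```python
from collections import defaultdict


def bytes_per_upstream(samples, bucket_seconds=300):
    """Sum bytes per (upstream, bucket start) from (timestamp, upstream, nbytes) samples."""
    totals = defaultdict(int)
    for timestamp, upstream, nbytes in samples:
        # Round the timestamp down to the start of its time bucket.
        bucket_start = timestamp - (timestamp % bucket_seconds)
        totals[(upstream, bucket_start)] += nbytes
    return dict(totals)


# Hypothetical samples: two upstream categories over two 5-minute buckets.
samples = [
    (0, "local-ix", 100),
    (10, "local-ix", 50),
    (20, "transit", 30),
    (301, "transit", 70),
]
usage = bytes_per_upstream(samples)
```

Smaller buckets give finer granularity at the cost of more stored data points, and keeping the upstream in the key is what makes it possible to report local and international traffic separately.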
Please join our community on the feature-voting page and let us know if you’d like to see a product improvement. We might add it to our list! What would you like to read about in the next quarterly report?
Oh… and GDPR: please be assured we are working very hard on making sure your (personal) data is secure. There will be a few announcements over the coming weeks on how and why we handle it in order to provide the best service. I myself will write a section in the next quarterly blog update on the challenges we faced and the solutions we put in place for the greater good.