Yesterday, on our Continuous Discussions (#c9d9) video podcast, we talked about ChatOps. During the live discussion, both my colleague Anders Wallgren and I kept mentioning the EXCELLENT talk given by our customer – Daniel Perez from HPE – at the recent DevOps Enterprise Summit (DOES16) in San Francisco.
During his talk, Daniel walked us through HPE’s DevOps journey and the proliferation of ChatOps within their organization. Daniel shared the decisions that led them to invest in ChatOps, the technologies that they integrate with, and the key use cases and capabilities of their chatbot, called Hammer. He further shared best practices and lessons learned for successfully implementing ChatOps – in a way that scales, that is helpful, and that is also fun!
Then (being the rock star that he is), Daniel did one better: he shared on GitHub their code for Hubot (Chatbot) integrations with ElectricFlow and all related tools – available as a Docker container.
To accelerate your adoption of ChatOps and leverage HPE’s various integrations (more on that below), you can use the GitHub code found here, and take it for a spin using the free Community edition of ElectricFlow.
What is ChatOps?
ChatOps is a collaboration and conversation-driven development model that emphasizes a transparent and collaborative workflow. While ChatOps is not a new thing, new technologies have made ChatOps a viable strategy to improve your delivery capabilities.
Using chat clients and chatbots, ChatOps focuses on connecting everything throughout the software delivery cycle – people, tools, environments, processes, automation, testing, deployments, etc. – into a persistent chat channel. The idea is that by keeping communication lines open, and having a central point of conversation and one pane of glass shared between all collaborators/stakeholders, we improve the flow of information, accelerate feedback loops, and streamline the delivery pipeline.
Your chat tools can do more than give you a log of what is being done, and when. Chatbots can do the work for you, 24/7! With “conversation-driven” development and operations, ChatOps helps accelerate application releases, shift-left testing, monitoring and diagnosing, and can even be used for self-healing of Production failures.
An example from HPE: their open-source monitoring tool detects an issue in one of the environments. Using chatbots, it automatically opens a ticket, posts in the relevant chat room, pulls in the relevant graphs and metrics from the different tools, and then brings the key people into the conversation to collaborate on what needs to be done. In this case, ChatOps provides a one-stop shop where you can pull in data, quick replies, and quick automations to enable your developers to resolve the issue. In other scenarios, the chatbot is configured for self-healing, and can automatically trigger a rollback, a new build, or a new deployment in order to remedy an issue in Production.
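To make that flow concrete, here is a minimal Python sketch of an alert being routed into a chat room. It is only an illustration: the `Alert` and `ChatRoom` classes, the `SELF_HEALING_ACTIONS` table, and the `handle_alert` function are hypothetical names invented for this example, not HPE’s code or any tool’s real API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Alert:
    """A monitoring alert; the fields are illustrative, not a real tool's schema."""
    service: str
    environment: str
    severity: str  # e.g. "warning" or "critical"


@dataclass
class ChatRoom:
    """Stand-in for a persistent chat channel; it only records messages."""
    name: str
    messages: list[str] = field(default_factory=list)

    def post(self, text: str) -> None:
        self.messages.append(text)


# Hypothetical mapping from (environment, severity) to an automated remedy.
SELF_HEALING_ACTIONS = {
    ("production", "critical"): "rollback",
}


def handle_alert(alert: Alert, room: ChatRoom, on_call: list[str]) -> str:
    """Announce the alert, pull people in, and return the action chosen."""
    room.post(
        f"[{alert.severity}] {alert.service} in {alert.environment}: ticket opened"
    )
    room.post("Paging: " + ", ".join(f"@{user}" for user in on_call))
    # Fall back to a human escalation when no automated remedy is configured.
    action = SELF_HEALING_ACTIONS.get(
        (alert.environment, alert.severity), "escalate"
    )
    room.post(f"Action: {action}")
    return action
```

Calling `handle_alert(Alert("checkout", "production", "critical"), ChatRoom("ops"), ["dana"])` leaves three messages in the room and picks the rollback path, while any alert without a configured remedy is escalated to people instead.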
With both relying on collaboration, shared visibility, and accelerated feedback loops – it’s no surprise that ChatOps has become a key enabler for DevOps and a proven pattern for accelerating your transformation.
Best Practices for ChatOps in the Enterprise:
Here are some takeaways for ChatOps in the enterprise that Daniel shared during his talk:
What can the chatbot do?
HPE uses Hubot for ChatOps. Hubot was developed by GitHub (the folks who first coined the phrase ChatOps) and is an open source bot that can integrate with your enterprise chat solution.
HPE’s Hubot is named “Hammer”. Some of its key capabilities and integrations are:
- Perform daily data look-ups and grab metrics from Nagios or Grafana
- Graph statsd performance data from ElectricFlow, SCM, GitHub Enterprise, and more
- Run builds, deployments or other automation workflows or pipelines
- Kick off Selenium tests and other test suites for various applications
- Provide application stats
- Alias commands
- Monitor general application health and integrate with SCM tools for analytics, environment discovery, etc.
- Perform self-healing – restart services, status checks, rollbacks/roll-forwards and repairs
- Run Chaos Monkey to break things
- It can even tell jokes!
HPE uses GitHub to store its Hubot source code and continues to enhance the bot’s capabilities, and uses ElectricFlow to deploy the actual bot. Hubot is also integrated with ElectricFlow to trigger processes and pipelines, or to report on executed tasks.
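As a rough illustration of how a bot can map chat messages to actions such as deployments and aliased commands, here is a small Python sketch. It is not Hubot (which is scripted in JavaScript/CoffeeScript) and not HPE’s code: the `Bot` class, its `respond` and `alias` methods, and the `deploy` handler are hypothetical names chosen for this example.

```python
from __future__ import annotations

import re
from typing import Callable, Optional


class Bot:
    """Tiny command router: messages addressed to the bot are matched by regex."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._handlers: list[tuple[re.Pattern, Callable[..., str]]] = []
        self._aliases: dict[str, str] = {}

    def respond(self, pattern: str) -> Callable:
        """Decorator that registers a handler for commands matching `pattern`."""
        def register(handler: Callable[..., str]) -> Callable[..., str]:
            self._handlers.append((re.compile(pattern, re.IGNORECASE), handler))
            return handler
        return register

    def alias(self, short: str, full: str) -> None:
        """Let a short phrase stand in for a longer command."""
        self._aliases[short] = full

    def receive(self, message: str) -> Optional[str]:
        """Return a reply, or None if the message is not addressed to the bot."""
        prefix = self.name + " "
        if not message.lower().startswith(prefix.lower()):
            return None
        command = message[len(prefix):].strip()
        command = self._aliases.get(command, command)  # expand alias if any
        for pattern, handler in self._handlers:
            match = pattern.fullmatch(command)
            if match:
                return handler(*match.groups())
        return f"Sorry, I don't know how to '{command}'."


hammer = Bot("hammer")


@hammer.respond(r"deploy (\S+) to (\S+)")
def deploy(app: str, env: str) -> str:
    # A real bot would trigger a pipeline here; this sketch only reports intent.
    return f"Triggering deployment of {app} to {env}"


hammer.alias("ship it", "deploy storefront to staging")
```

With this in place, `hammer.receive("hammer ship it")` expands the alias and routes to the same `deploy` handler as the full command, which is one way to reuse a single integration behind several chat phrases.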
Tips to keep in mind:
While Hammer undoubtedly improved HPE’s development pipeline, Perez warns that there are a few “gotchas” to watch out for:
- Pick the tool that best fits your use case: HPE tested a handful of tools before settling on Hubot, mostly for its security capabilities.
- Automate things that make sense: You shouldn’t automate everything. Some tasks don’t need automation, and others can be over-engineered to the point where the automation is not worth the effort.
- Keep the integration simple and reuse code as much as possible: Start small, and iterate from there. When they first started with ChatOps, the team wrote many one-off solutions. These worked great in the short term, but quickly became complex and hard to maintain. The solutions didn’t work together, had missing or too many components, and kept breaking. Keep it as simple and generic as possible, focusing on long-term solutions that will work for a variety of uses.
- Persistent data is a must: HPE uses Redis Brain to store user info, chat history, key/value pairs, etc., and Mongo for script data.
- If you’re not integrating with your tools for automated builds and test/deploy pipelines – you’re missing out!
- Pull in the right data, and be conscious of signal-to-noise ratio: Make sure that you’ve got the right data in the chat room. You don’t want to flood your chat room with too much data, as it can confuse developers and make the right information hard to find. Make sure that the data is relevant.
- Avoid making the chatbot a single point of failure: If your chatbot dies, you should still be able to keep working. For example, if your workflow has your bot pull data from one of your monitoring tools and write it to a database, what do you do if the bot is not available? Is there another way to pull the data?
- Make sure your chatbot scripts are tool-agnostic: Tools like Flowdock, Slack, and HipChat make that possible and will help future-proof your scripts so you don’t run into issues if your chat tool changes for any reason.
- Security is paramount: As with most enterprises, security is always top of mind for HPE. Fortunately, Hubot comes with many security features baked in, and its role-based permissions (Hubot auth), combined with the ACLs native to ElectricFlow, were a perfect match. Hubot uses the Express framework to expose a port for your integrations, and you can use it for basic authentication. You can also put an Nginx proxy in front to secure the communication from the chatbot. Avoid using personal accounts for writing integrations, and ensure that the automation account has the right scope of permissions.
- Properly onboard team members: Enable them to use the tool effectively and to keep contributing to it.
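The role-based permissions mentioned under the security tip can be pictured with a short Python sketch. This is an assumption-laden illustration rather than how Hubot auth or ElectricFlow ACLs are actually implemented: the `USER_ROLES` and `REQUIRED_ROLE` tables, the user names, and both functions are hypothetical.

```python
from __future__ import annotations

# Hypothetical role assignments; a real bot would load these from persistent storage.
USER_ROLES = {
    "dana": {"deployer", "viewer"},
    "lee": {"viewer"},
}

# Each command names the role it requires; read-only commands need only "viewer".
REQUIRED_ROLE = {
    "status": "viewer",
    "deploy": "deployer",
    "rollback": "deployer",
}


def is_authorized(user: str, command: str) -> bool:
    """Return True only if the user holds the role the command requires."""
    required = REQUIRED_ROLE.get(command)
    if required is None:
        return False  # unknown commands are denied by default
    return required in USER_ROLES.get(user, set())


def run_command(user: str, command: str) -> str:
    """Gate every command behind the role check before doing any work."""
    if not is_authorized(user, command):
        return f"@{user} is not allowed to run '{command}'"
    return f"Running '{command}' for @{user}"
```

Denying unknown users and unknown commands by default keeps the bot’s reach limited to what has been explicitly granted, which matches the advice to give the automation account only the scope of permissions it needs.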
ChatOps has enabled HPE to develop at a much faster pace while simultaneously giving their teams increased visibility, accountability and control over the entire DevOps pipeline.
With a shared collaboration space, persistent chat, historic information, self-service automation and a 24/7 chatbot at your service, ChatOps is an effective strategy for a successful DevOps adoption. Check out the resources below for more tips on ChatOps, or to start your journey:
Watch the video recording of Daniel’s DOES16 talk:
- HPE’s Hubot integrations code on GitHub
- Learn more about HPE’s own DevOps transformation journey – and their use of ChatOps for monitoring and self-healing – as shared at DevOps Enterprise Summit through the years.
- For more best practices for ChatOps and DevOps, check out the latest episode of the #c9d9 podcast.