DevOps has emerged as an essential philosophy, bridging the gap between software development and IT operations. It goes beyond just using DevOps tools. A successful DevOps team not only speeds up the delivery process but ensures quality and reliability. However, creating such a team requires a harmonious amalgamation of tools, culture, processes, collaboration, and more. What are the critical components of a top-notch DevOps team, and how can organizations integrate them for optimum results? As a part of this series, we had the pleasure of interviewing Dylan Etkin, co-founder and CEO of Sleuth.
Hi, Dylan – Can you share your backstory with us?
I started my career as a software engineer. I was fortunate to join a small startup, Atlassian, as their 20th employee. I spent 10 years at Atlassian learning a ton and watching the organization grow from 20 to 1,200 employees. I worked as one of the first 3 engineers on Jira, the first Jira Architect, and as the Engineering lead for Bitbucket, growing that product from 40K to 2M users.
I left Atlassian and joined a small startup, Statuspage. One year later, Statuspage was acquired by Atlassian, and I found myself back in the mother ship again. After 3 years leading that team at Atlassian, I decided to strike out on my own, starting Sleuth to productize what has always been my passion: building efficient engineering teams.
I have always been fascinated by developer tools and helping teams be efficient and continuously learn.
Who do you credit for helping you achieve success?
Many people have helped me along the way. The older I’ve gotten, the more I’ve realized what a special network I’ve grown and how open they all are to helping when asked.
One example that stands out is John Kodumal, the co-founder and CTO of LaunchDarkly. John and I were peers while at Atlassian. When he went off to found LaunchDarkly, he was always open about his startup experience.
As I began to motivate myself to start Sleuth, he was supportive at every level. From idea sparring to open conversations about how to proceed to signing up as our first customer, John has always helped at every point of my journey.
Can you share with us three strengths, skills, or characteristics that helped you reach this point in your career? How can others actively develop these areas within themselves?
- Persistence. Skill and intelligence will only take you so far. Coming back to a problem over and over again is far more powerful. Persistence, even in the face of multiple failures, has defined success for me.
- Attention to detail. This is a characteristic that has always come naturally to me. As an engineer, it’s a vital skill that allows you to accomplish the tasks in front of you in a way that excels. In a startup environment, this skill is even more important. When guiding a product from vision to reality, attention to detail will make the difference between a mediocre product and something users love.
- Delegation. As much as you need to fully own your business, a startup very quickly turns into an exercise of hiring people who are smarter than you in many areas and embracing giving away your legos so others can succeed and push your business forward.
Which skills are you still trying to grow now?
I’m always working on many things; there’s never any shortage of skills to work on. Top of mind for me right now is the art of getting the best from your people. Everyone is different, and knowing how to provide an environment where each individual can do their best work is a challenge.
Let’s talk about having a successful DevOps team. What are the key goals a DevOps team might identify for a digital transformation journey?
DevOps teams are empowered to efficiently take work from concept to successful launch in production. Therefore, the number one goal for a DevOps team is to build the tooling, process, and culture that allows individual developers to quickly take work from concept to successful launch.
Goals that support achieving the number one goal are:
- Ability to define work in small batch sizes
- A fully automated CI/CD pipeline
- Observability and pre-production environments that act as safety nets so developers can confidently push changes to production
- A clear process for what to do when changes go wrong. Everyone knows how to detect failure and get the system back into a working state
- Buy-in from the rest of the organization: PMs, Marketing, Support, Sales, and Execs. Teams won’t be able to pull off a DevOps transformation without the support of the entire organization
Are there any challenges or common pitfalls that DevOps teams should consider?
There are too many to list. One that stands out is to recognize that practicing DevOps means you are pushing more responsibility onto the developer. This doesn’t come for free, and if you don’t invest in process and tooling, you run the risk of building a system that won’t leave any time for developers to actually get work done.
You also run the risk of building an environment where developers will burn out quickly. You must remember that people only have so much capacity, so you have to be mindful of what you put in place to support developers practicing DevOps.
How can effective collaboration and communication among team members enhance the productivity and success of a DevOps team, and what practices can facilitate this?
Communication and collaboration are required to practice DevOps. Some important practices include:
- Using an issue tracker for all work and including that context and information in commits and code reviews
- Code review with pull requests
- Providing a real-time chat solution like Slack for your team
- Team-based deployment notifications in Slack or your team ChatOps tool
- Slack-based approvals for promoting changes from pre-production environments to production
- Individual developer notifications when their changes ship and when they’ve breached failure conditions
- Team visibility into incidents and periods when deployments should wait
- Team demos with PMs and other stakeholders
- Daily standups
What role does CI/CD play in DevOps, and what are the best practices for implementing CI/CD pipelines to ensure a seamless and reliable software release process?
CI/CD is arguably the most important tooling component that supports a team practicing DevOps. The ideal is for teams to have a fully automated deployment pipeline that executes in a reasonable amount of time, so developers don’t have a long wait and can shepherd their changes into production. It’s important to include CI and test execution as part of this ideal pipeline. Running a robust test suite against your changes as a gate to deployment is your primary safety net to ensure good changes are shipped.
If teams are only looking to deploy once a week, then it is possible to have a deployment pipeline that isn’t fully automated. However, if your team is looking to deploy at least once a day, you must be fully automated, and deployment must be a non-event.
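As a rough illustration of tests acting as a gate to deployment, here is a minimal Python sketch. The `Stage` type, the `run_pipeline` function, and the stage names are hypothetical and not taken from any particular CI/CD product; a real pipeline would run actual build, test, and deploy commands.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One pipeline step (hypothetical type for illustration)."""
    name: str
    run: Callable[[], bool]  # returns True when the step succeeds


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Because the loop breaks on failure, a deploy stage placed after the
    test stage never runs when the tests fail.
    """
    log = []
    for stage in stages:
        ok = stage.run()
        log.append(f"{stage.name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            break
    return log


# Stand-in steps: the test stage fails, so deploy is never reached.
pipeline = [
    Stage("build", lambda: True),
    Stage("test", lambda: False),
    Stage("deploy", lambda: True),
]
result = run_pipeline(pipeline)
print(result)
```

With the failing test stage above, the log ends at the test step and the deploy step is skipped, which is the behavior a deployment gate is meant to guarantee.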
How does fostering a DevOps culture and mindset contribute to the overall success of a DevOps team, and what strategies can organizations use to promote this culture among their development and operations teams?
For developers:
- Weekly or bi-weekly sprint planning can help set the size of tasks and create a team agreement on how tasks will fit into this window
- Automated CI against the release branch and team-wide visibility into those results to help the team keep the release buildable
- Maintaining a “disturbed” role on a team where it’s clear whose job it is to keep the release branch buildable
- One of the best mechanisms for being able to ship incomplete work is adopting some form of feature flagging.
- It can also be highly effective to deploy every pull request one at a time. Adopting this pattern allows you to include reviewing the batch size of a change along with your pull request code review. If a batch is too large, other team members can ask a developer to break their change into multiple pull requests.
- Ensure developers have enough metrics, and enough understanding of their norms, to be able to verify a deploy
- Ensure developers understand how to escalate and roll back if need be
- Empower developers to deploy their own changes to all the environments you maintain, quickly and without failure
For PMs and designers:
- Include a PM in the planning process to give them an opportunity to express the needs of the customer in incremental deploys.
- Implement feature flags, allowing code to be rolled out but not yet exposed to customers. This helps PMs retain the flexibility they need while allowing developers to charge ahead.
- Have an agreed-upon mechanism for feedback and rework, such as an issue assigned directly to a developer. One strategy for solving the incomplete work issue is for a team to adopt six-week cycles.
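Feature flagging, mentioned for both developers and PMs above, can be sketched in a few lines of Python. This is an assumption-laden illustration, not the API of any real flagging product: the `is_enabled` function, the `new-checkout` flag name, and the percentage-based rollout scheme are all hypothetical.

```python
import hashlib


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Decide whether a flag is on for a user.

    The flag name and user ID are hashed into a stable bucket from 0 to 99,
    so the same user always gets the same answer for a given percentage.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent


def checkout_page(user_id: str, rollout_percent: int) -> str:
    """The new code path is deployed but only exposed behind the flag."""
    if is_enabled("new-checkout", user_id, rollout_percent):
        return "new checkout"
    return "old checkout"


# 0 percent: code is shipped but hidden from everyone.
print(checkout_page("user-1", 0))
# 100 percent: the feature is fully released.
print(checkout_page("user-1", 100))
```

The point of the pattern is that the deploy and the release become separate decisions: developers merge and ship incomplete work, while the PM chooses when to raise the rollout percentage.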
For Engineering Managers:
- Implement processes that facilitate communication, such as recurring planning meetings, code reviews, and weekly demos.
- Commit to dedicating engineering time to keep the release branch builds passing. Keeping the code flowing also means paying into this new system. When the team identifies a bottleneck, the manager should be able to allow the team to remove it (e.g., fix CI tests that have a flakiness higher than 20%).
- Implement tools that help you continuously measure DORA metrics
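To make the DORA metrics a little more concrete, the sketch below computes three of the four (deployment frequency, change failure rate, and lead time for changes) from a list of deploy records. The `Deploy` record and the function names are hypothetical, the sample data is invented for illustration, and time to restore service is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List


@dataclass
class Deploy:
    """One production deploy (hypothetical record shape)."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool  # True if the deploy caused a failure in production


def deployment_frequency(deploys: List[Deploy], days: int) -> float:
    """Average number of deploys per day over the observed window."""
    return len(deploys) / days


def change_failure_rate(deploys: List[Deploy]) -> float:
    """Fraction of deploys that caused a failure."""
    if not deploys:
        return 0.0
    return sum(d.failed for d in deploys) / len(deploys)


def median_lead_time_hours(deploys: List[Deploy]) -> float:
    """Median time from commit to production, in hours."""
    return median(
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deploys
    )


# Invented sample data: four deploys over two days, one of which failed.
start = datetime(2024, 1, 1, 9, 0)
deploys = [
    Deploy(start, start + timedelta(hours=1), failed=False),
    Deploy(start, start + timedelta(hours=2), failed=False),
    Deploy(start, start + timedelta(hours=3), failed=True),
    Deploy(start, start + timedelta(hours=4), failed=False),
]

print(deployment_frequency(deploys, days=2))
print(change_failure_rate(deploys))
print(median_lead_time_hours(deploys))
```

Measuring continuously, rather than once, is what lets a manager see whether removing a bottleneck actually changed these numbers.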
For Execs:
- Surface the Accelerate metrics for your projects and make them available, with context, to your execs.
- Surface your uptime or equivalent for your applications.
- Provide your exec team with the same high-level information about large chunks of functionality that are shipping.
- If you are an executive, understand that change takes time. Be firm about holding your teams accountable for high-level goals but flexible about how they achieve them. Trust, but verify.
- Take the time to explain to your organization how you are delivering value to your customers. Explain how incremental delivery works, what things will be released in a big bang, and how this way of working can allow you to deliver value to customers faster.
What are the essential components of a successful DevOps team?
1. The team works together in a blameless culture.
Software development is a team sport. Incidents and mistakes are opportunities for teams to learn, not opportunities for blame. A blameless culture can extend past incidents. When setting team improvement goals, a team can focus on improving outcomes rather than blaming individuals or processes for why things aren’t where they should be.
2. Deployments are a non-event.
To ship in small increments and respond to incidents and customer feedback quickly, deployment must be easy and must not induce any fear or anxiety. If a team doesn’t trust its deployments, or if they take too long or involve manual steps, this will lead to fear and hesitation, which will not allow teams to adopt DevOps.
3. Developers are individually empowered to take changes from concept through reliable launch in production.
DevOps works when you shift the responsibility for the entire concept-to-launch process into the hands of developers. This allows your team to break work into small batches and deliver value in small increments to the customer quickly.
4. The team has safety nets, such as observability and failure measures, so they have the confidence to move quickly.
The days of moving fast and breaking things are over. Now, you need to move fast reliably. To do so, you must have automated safety nets in place.
5. The entire organization has bought into working in a DevOps flow.
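One way to picture the automated safety nets in item 4 is a post-deploy check that compares a metric against its recent norm. The Python sketch below is only an illustration under stated assumptions: the `breaches_norm` and `verify_deploy` functions, the choice of error rate as the metric, the three-standard-deviation threshold, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev
from typing import List


def breaches_norm(baseline: List[float], current: float, max_sigma: float = 3.0) -> bool:
    """Return True when the current value is far above the baseline norm.

    The norm is the mean of recent readings, and "far above" means more
    than max_sigma standard deviations (an assumed threshold).
    The baseline needs at least two readings for stdev to be defined.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    return current > mu + max_sigma * sigma


def verify_deploy(baseline_error_rates: List[float], post_deploy_error_rate: float) -> str:
    """Suggest an action after a deploy based on the error-rate check."""
    if breaches_norm(baseline_error_rates, post_deploy_error_rate):
        return "rollback"
    return "healthy"


# Invented error rates from before the deploy (about 1 percent of requests).
baseline = [0.010, 0.012, 0.011, 0.009, 0.010]

print(verify_deploy(baseline, 0.011))  # close to the norm
print(verify_deploy(baseline, 0.05))   # well above the norm
```

A check like this gives the developer who shipped the change a clear signal for when to escalate or roll back, instead of relying on someone noticing a dashboard.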
What emerging trends do you foresee in the landscape of DevOps that could significantly impact digital transformation strategies in the future?
This may seem like an obvious answer right now, but AI tools are already transforming how developers work and will certainly have a large impact on how DevOps is practiced moving forward. AI will, at the very least, be used to detect and identify failure in ways that aren’t possible today.
If it lives up to all of its promises, then it will fully replace the complicated deployment and testing pipelines we have today.