Radical Agility at Scale
Radical Agility, Zalando’s approach to managing our tech organization, asserts that autonomy, mastery, purpose and trust create the optimal conditions for delivering innovative products quickly at scale. Our engineering teams enjoy end-to-end ownership of their work, which enables us to preserve our start-up spirit and ability to experiment in the face of explosive growth (800+ technologists and counting). The teams are also responsible for all software provisioning and quality-related aspects, including design, implementation, code review, testing, continuous integration, and operation. Teams can challenge the purpose of their work, make technical decisions autonomously, and pick up new skills via their individualized Tours of Mastery.
About RESTful APIs, API First and API Quality
Zalando’s software architecture centers around decoupled microservices that provide functionality via RESTful APIs with a JSON payload. Small teams own and deploy these microservices, which run in AWS team accounts. Microservice development begins with defining the API outside the code and gathering peer review feedback on that definition.
Our APIs most purely express what our systems do, and are therefore highly valuable business assets. With this in mind, we’ve adopted “API First” as one of our key engineering principles. API First encompasses a set of quality-related standards we encourage our teams to follow to ensure that our APIs:
- are easy to understand and learn
- are general and abstracted from specific implementation and use cases
- are robust and easy to use
- have a common look and feel
- follow a consistent RESTful style and syntax
- are consistent with other teams’ APIs and our global architecture
Ideally, all Zalando APIs should look like the same author created them.
Designing high-quality, long-lasting APIs has become even more critical for us since we started developing our new, open platform strategy, which transforms Zalando from an online shop into an expansive fashion platform. Our strategy emphasizes lots of public APIs used by our external business partners via third-party applications. Good API design is hard work, takes time and ideally involves ample code review. We can only evolve our APIs by providing backward compatibility with robust clients. We cannot afford to break our APIs, and must approach large-scale changes cautiously. API First helps to keep us on track.
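To make the idea of a "robust client" concrete, here is a minimal sketch of a client that tolerates compatible API changes. It reads only the fields it needs, ignores fields it does not know, and falls back to a default when an optional field is missing. The `Article` resource, its field names, and the `EUR` default are hypothetical and chosen only for illustration; they do not describe an actual Zalando API.

```python
import json
from dataclasses import dataclass


@dataclass
class Article:
    """The subset of fields this hypothetical client actually needs."""
    id: str
    name: str
    currency: str = "EUR"  # assumed default when the field is absent


def parse_article(payload: str) -> Article:
    """Parse a JSON payload, ignoring any fields the client does not know."""
    data = json.loads(payload)
    return Article(
        id=data["id"],
        name=data["name"],
        # Optional field: fall back to the default instead of failing.
        currency=data.get("currency", "EUR"),
    )


# An older response and a newer one that adds fields in a compatible way.
old_response = '{"id": "a-1", "name": "Sneaker"}'
new_response = (
    '{"id": "a-1", "name": "Sneaker", "currency": "EUR", '
    '"sizes": ["42", "43"]}'
)

# Both versions parse to the same client-side object.
print(parse_article(old_response) == parse_article(new_response))
```

A client written this way keeps working when the API adds fields, which is what lets the provider evolve the API without breaking its consumers.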
Zalando’s API Guild
Not long after Radical Agility’s official adoption in March, some of us created an API Guild for teams to share their experiences and discuss how to ensure API quality. The API Guild, like all of our internal guilds, is an informal group of Zalando technologists that meets regularly to advance topics of interest. Members represent diverse teams across our organization. In its early days the API Guild attracted only a few members and focused on REST. Since then the Guild has grown significantly, with around 20 dedicated members who focus on:
- shared knowledge and best practices around API design and API implementation in our polyglot environment (Scala, Java, Clojure, Python, etc.)
- standards and guidelines
- quality assurance via API peer review feedback
Many members have gained experience in designing RESTful APIs, and use the Guild to share their knowledge or serve as team ambassadors. In addition to developing a RESTful API practices document, we’ve discussed examples of good API design and implementation, picked “APIs of the Month,” and organized RESTful API Coder Dojos, among other activities.
The Guild has been a valuable forum for us to increase awareness of great APIs, design techniques and best practices. Members meet bimonthly to discuss API design topics, make decisions on guidelines, and improve our documentation. All meetings, documents, reviews and chats are public and open for all engineers. If API issues come up during peer review or in discussions, they go on the Guild’s meeting agenda. If they can’t be clarified in a time-box, someone takes responsibility for further research and follow-up.
Avoiding the API Review Bottleneck
We want peers from outside the owning team to review all APIs, so we have adopted an open review procedure that’s as lightweight as possible. The API Guild is invited to all reviews; even more importantly, so are the teams that use the APIs. Here, the Guild is not acting as an “approval board,” but as a sounding board and experienced review resource that helps us to avoid bottlenecks.
Despite our strides, API review still involves a lot of work. To ensure valuable Guild-generated feedback on all reviews, we encourage members to commit around 10 percent of their week to API design and best-practice sharing. We emphasize the Guild’s purpose and value, and ask members to record their contributions in their personal and/or team OKRs. We’ve also introduced a weekly stand-up that focuses specifically on review coordination, and we enable new hires to pair up with experienced members.
Status and Outlook
The API Guild has proven to be a valuable tool for sharing knowledge and best practices of API design and implementation. It supports our autonomous engineering teams in achieving overarching alignment and high quality of the APIs they own. With the Guild’s help, RESTful API design, implementation and review proceed smoothly and steadily. The Guild’s work is especially important for Radical Agility and our emerging microservice landscape, which is driven by REST, SaaS, cloud, API First, and peer review.
With our increasing knowledge and experience, the API Guild will focus more and more on reviews to achieve overarching design consistency and quality. For the time being, there is still much about REST that we have to learn, for example how to support microservice and API discovery. Another challenging topic is HATEOAS; we don’t have specific recommendations for a Zalando standard yet, and expect this to emerge via experimentation.
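For readers unfamiliar with HATEOAS, the core idea is that a response carries links to the actions currently available, so clients follow those links instead of hard-coding URLs. The sketch below illustrates this with a hypothetical order resource. The `_links` structure is borrowed from HAL-style conventions purely as an assumption; it is not a Zalando recommendation, since no such standard exists yet.

```python
from typing import Optional

# Hypothetical order representation with HAL-style links (an assumption,
# not a Zalando standard).
order = {
    "id": "o-17",
    "status": "open",
    "_links": {
        "self": {"href": "/orders/o-17"},
        "cancel": {"href": "/orders/o-17/cancellation"},
    },
}


def link_for(resource: dict, relation: str) -> Optional[str]:
    """Return the URL for a link relation, or None if it is not offered."""
    link = resource.get("_links", {}).get(relation)
    return link["href"] if link else None


# The client asks the response what it may do next.
print(link_for(order, "cancel"))  # /orders/o-17/cancellation
print(link_for(order, "ship"))    # None
```

In this style, the presence or absence of a link tells the client whether an action is currently possible, so the server can change URLs or state rules without a client release.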
To learn more about API engineering at Zalando, please also check out the more recent tech blog post “Developing Zalando APIs” and the InfoQ interview with Dr. Thomas Frauenstein, “How Zalando Delivers APIs.”