Tech Lead notes from Software Craftsmanship London 2019

As part of working at Redgate I was lucky enough to go to Software Craftsmanship 2019. Here are some notes and key takeaways from the talks.

Aligning Product and Software Design, Sandro Mancuso

We struggle to find the time to do software design and architecture. If we have a project mindset, we only focus on the external customer goal without focussing on the means and investment.

Software products require continuous investment and the point of software design is to make it possible to evolve the business faster. We want to enable continuous delivery, testing and parallel development. As you start refining the product strategy (e.g. via business model canvas/value proposition canvas) you should also be refining the software strategy. This makes the product roadmap more feasible.

By using the idea of Minimum Valuable Increment (MVI) we can include both the external deliverable and an internal focus on how to make it. These increments should be delivered in vertical slices. We should review the product and technical strategies together and formally take time to plan milestones - software design should be an explicit part of the business strategy.

My key takeaways: The product strategy is more powerful and effective when it takes into account technical feasibility.

The Lost Art of Software Design, Simon Brown

“Big design up front is dumb. Doing no design up front is even dumber.” - David Thomas

When challenged to draw architecture diagrams, workshop attendees produced diagrams rated 7/10. Why not 10/10? Often we saw superficial high level diagrams without any useful detail like technology choices or process separation (microservices/monolith). There is a way of drawing architecture diagrams - UML. But it is perceived as old fashioned, anti-agile and controlling. The point of design is to come up with a solution, so it should be detailed.

Based on the diagram, we should be able to answer:

  1. Is that what we are going to build?
  2. Is that going to work?

Up front design creates a good starting point and direction. This can change over time and that’s fine, but you can still make the best decisions at the start. Choices of implementation strategy are not implementation details; they are significant decisions that should be made explicit and not left to a single implementer.

These significant decisions are:

  1. Programming languages
  2. Technologies and platforms
  3. Monolith, microservices or hybrid approach

We can use UML. We can also use the C4 model (which Simon created) to zoom in and out between levels of detail. Teams should be able to focus on important design questions instead of asking what lines/colours/boxes mean. You can identify risks on these diagrams (risk storming). These diagrams are ready to implement when there is a clear shared understanding of the up front decisions.

My key takeaways: We are allowed to do up front design when we are doing agile. We should do more design up front and include more detail about the solution. This makes conversation easier and prevents rework.

What does GREAT Architecture Look like?, James Birnie

Architecture is the sum of all the stuff that is hard to change. Your competitiveness is your ability to change.

We should use the best tool for the job, optimising for speed of delivery. It’s bad to tell developers what tools to use just for the sake of standardisation. We should trust developers more and empower them, rather than limiting tool choice by being dictators. Instead of telling people to document, we should communicate better. Also, we should avoid “architectology” where we have to talk to lots of people to get anything done.

Good architecture looks like The Twelve-Factor App. We can measure architecture via fitness functions, see Building Evolutionary Architectures. One edgy example is measuring coupling with tech lead meeting times. We can also use the 4 Key Metrics from Accelerate, fixing weakest areas one by one.
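As a rough illustration of treating the 4 Key Metrics as something you can compute, here is a minimal sketch. The `Deployment` record and its fields are my own assumptions, not something from the talk or the book, and it expects at least one deployment in the period.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def four_key_metrics(deployments, period_days):
    """Illustrative calculation of the four metrics over a reporting period."""
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Made-up data: four deployments over two days, one of which failed.
start = datetime(2019, 10, 1, 9, 0)
deployments = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start, start + timedelta(hours=4)),
    Deployment(start, start + timedelta(hours=6), failed=True,
               restored_at=start + timedelta(hours=7)),
    Deployment(start, start + timedelta(hours=8)),
]
metrics = four_key_metrics(deployments, period_days=2)
```

The weakest number in `metrics` is then a candidate for the next improvement.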

My key takeaways: architects can upset people if perceived as dictators, we should enable instead of instruct.

The Gordian Knot, Alberto Brandolini

The main differentiating factor from Accelerate was loose coupling, not technology. Stop splitting around data, split around behaviour. If you split into microservices in the wrong way, then you are in trouble because you can no longer refactor with an IDE (as you could with a monolith).

Design choices define interactions that shape culture. There are different winning behaviours for each interaction.

Pink says drive is autonomy, mastery and purpose (Pink motivation video is a 10 minute game changer). You can check this against projects/teams. The difference for individuals is between a strong drive and “just a job”. If we have bounded contexts as units of clear responsibility, then we have the autonomy to take responsibility for our own changes and mastery over that context.

We should look at our feedback loops. We want focussed learning, improvements and fixes from small feedback loops. “People won’t improve a system if they won’t stay around long enough to see the benefits.” - Have you ever repainted a hotel room? We can look at the time people stay in a system versus how long we take to enjoy the benefits. Systems do not improve if the loop is too long. We can look at feedback cycles on refactoring, TDD, architecture, training. With long feedback loops, negative feedback becomes a blame game, so speed matters.

Watch out for reward deprivation systems:

  1. Releasing in 15 months for small possibility of positive feedback with random negative feedback for things not your fault in between
  2. Microservices that are coupled and guaranteed to fail (scapegoat)

“This job will allow you to experience a continuous state of fear, anger and guilt.”

We should make safety, responsibility and pride key parts of our design. Loose coupling results in autonomy. The number of meetings needed to reach agreement affects purpose. Externals we depend on affect autonomy. Trade-offs affect mastery.

The bounded context is a unit of language consistency, purpose, responsibility and pride.

Books: Event Storming and Domain Driven Design (first 15 years)

My key takeaways: look for the winning strategies on our teams and make sure they end in the result we want.

Testing Microservices: From Development to Production, Daniel Bryant

To test microservices you should replace unowned components with test doubles. Our developer machines aren’t powerful enough to test all the microservices at once. Isolate services for loosely coupled tests. Include tests that resemble production (including security).
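A minimal sketch of what replacing an unowned component with a test double can look like. The `PaymentGateway` interface, `OrderService` and the account names are hypothetical and only there to make the idea concrete.

```python
from typing import Protocol


class PaymentGateway(Protocol):
    """Hypothetical interface in front of a service that another team owns."""

    def charge(self, account: str, pence: int) -> bool: ...


class StubPaymentGateway:
    """Test double: answers from memory instead of calling the real service."""

    def __init__(self, declined_accounts=()):
        self.declined_accounts = set(declined_accounts)
        self.charges = []  # record calls so tests can inspect them

    def charge(self, account: str, pence: int) -> bool:
        self.charges.append((account, pence))
        return account not in self.declined_accounts


class OrderService:
    """The service under test; it only knows about the gateway interface."""

    def __init__(self, gateway: PaymentGateway):
        self.gateway = gateway

    def place_order(self, account: str, pence: int) -> str:
        if self.gateway.charge(account, pence):
            return "confirmed"
        return "payment declined"


# The service is tested in isolation, with no network and no other services running.
gateway = StubPaymentGateway(declined_accounts={"acct-2"})
orders = OrderService(gateway)
result_ok = orders.place_order("acct-1", 1250)
result_declined = orders.place_order("acct-2", 999)
```

Because `OrderService` depends only on the interface, the same code can be wired to the real gateway in the tests that resemble production.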

Agile Testing Quadrants is a useful model for driving conversations:

[Image: Agile Testing Quadrants]

My key takeaways: it is hard to test a distributed system in its entirety so you have to mock out services.

Feedback Loops for Software Delivery, Gojko Adzic

Feedback loops allow us to control things we don’t understand. Examples we all know include house thermostats or showers. Both of these have oscillations between the person and complex system - you change the control and you have to wait to see if the change has the result you want. This delay causes oscillations as you have to keep adjusting the input to get the output you want. Feedback loops are everywhere, including in software development. Changing priorities can cause oscillations at work in the feedback loop between product manager and development team.
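To convince myself of the shower example, here is a toy simulation. The model and all the numbers (target temperature, gain, delay) are my own assumptions rather than anything shown in the talk: the person nudges the tap in proportion to the error they feel, but what they feel left the tap `delay` steps ago.

```python
def simulate_shower(delay, steps=40, target=38.0, start=20.0, gain=0.5):
    """Return the tap setting over time when feedback arrives `delay` steps late."""
    settings = [start]
    for t in range(steps):
        # The water felt now reflects the tap setting from `delay` steps ago.
        felt = settings[max(0, t - delay)]
        # Adjust the tap in proportion to how far off the felt temperature is.
        settings.append(settings[-1] + gain * (target - felt))
    return settings


immediate = simulate_shower(delay=0)  # feedback with no delay
delayed = simulate_shower(delay=3)    # the same person, with a three-step delay

print("max with no delay:", max(immediate))
print("max with delay:   ", max(delayed))
```

With no delay the setting creeps up towards the target and never passes it. With the delay, the person keeps turning the tap up while still feeling cold water, so the setting overshoots the target and then has to be corrected back down.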

In agile, you can spend a lot of time on a failing project if you only measure progress internally. Effort metrics like story points represent how much work was done rather than how much value users are getting out of that work. Frameworks like Scrum can be seen as sensors in a feedback system that can tell us when things are wrong, but we need to talk to end-users to get feedback on progress. The idea that frameworks like TDD lead to good design is wrong, because it doesn’t include the feedback loop from the customer. You need the feedback to make a good design.

Feedback loops are important to keep businesses running. AWS allows you to send traffic to a replacement version of software and scale up while checking for a drop in sales or a drop in conversion rate. You don’t need to understand the detail here - if the new version is worse, kill it! As a counterexample, Knight Capital Group was brought down (75% value in 45 minutes) by a deployment bug that it couldn’t fix in time because it lacked a feedback loop.
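The "if the new version is worse, kill it" loop can be sketched in a few lines. This is not an AWS API: the `observe` callback, the traffic shares and the 10% threshold are all assumptions for illustration, and a real check would also want enough traffic for the comparison to be meaningful.

```python
def should_kill_new_version(baseline, candidate, max_relative_drop=0.10):
    """Compare conversion rates; each argument is a (conversions, visitors) pair."""
    base_rate = baseline[0] / baseline[1]
    cand_rate = candidate[0] / candidate[1]
    return cand_rate < base_rate * (1 - max_relative_drop)


def rollout(observe, shares=(0.01, 0.10, 0.50, 1.00)):
    """Scale traffic up step by step; return the final share of traffic on the new version.

    `observe(share)` is a hypothetical callback that routes `share` of traffic to
    the new version and returns (baseline_stats, candidate_stats).
    """
    for share in shares:
        baseline, candidate = observe(share)
        if should_kill_new_version(baseline, candidate):
            return 0.0  # roll back: all traffic returns to the old version
    return 1.0


# Made-up observations: one healthy release and one that halves conversions.
healthy = rollout(lambda share: ((50, 1000), (49, 1000)))
broken = rollout(lambda share: ((50, 1000), (20, 1000)))
```

Nothing in the loop needs to know why the new version is worse, which is the point: the feedback loop controls something we don't fully understand.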

People usually measure things that are not valuable as measuring important things is expensive. Most people never have any direct interaction with end-users and half of the industry is not aware of what’s happening on client devices via error reporting. This can be a false economy.

My key takeaways: you are likely to fail if you do not have feedback loops, and it’s worth thinking about them explicitly.

TDD with Petri Nets, Aslak Hellesoy

We can use visual Petri Nets model editors to make sure our code conforms to a model. This can act as living documentation.

My key takeaways: it is probably easier to understand a Petri Net animation than to read all the code.

What’s Machine Learning Got to Do with It?, Frances Buontempo

I enjoyed this talk without making notes.

My key takeaways: feedback is important for both machine learning and human learning. There is no learning without feedback.

Balancing Forces in Software Design, Mashooq Badar

The context frames the solution e.g. microservices or monolith, and our challenge is to come up with a solution that balances all concerns. Good design does this optimally. There are conflicts between forces/concerns and here you have to find a midpoint. We don’t usually look at software like this, but we could get this insight by looking at the solution holistically. We need to think about our context now and the expected context in the future.

When we come up with a solution based on a lexical set of words (checkout) we are constraining the solution by those words, and these may not be the best model for our particular problem.

We should look at the forces associated with the problem:

  1. How to test is not a problem when writing software, it is the solution to another problem. The software system has forces like “many payment methods”, “search needs to be accurate” and “evolve with the changing requirements”, and we can balance these constraints to find the solution to how to test our software.
  2. With an API the forces could be “payments must go through PCI compliant infrastructure”, “teams divided over organisational boundaries” and “multiple individuals developing at the same time” and again we have to find an optimal balance to these forces.

The context section of an ADR is defined to include the forces at play but we do not include these generally in our ADRs.

We can visualise forces in a graph structure and then focus on clusters. We have to see the whole thing to understand how context changes affect the balance of the solution. Event storming allows you to explore the problem bottom up, which can avoid some bias from choosing the wrong lexical set/words to represent the problem. Sketch notes can help show the whole and the structure of the problem.
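One way to read "a graph structure with clusters" is as connected components: forces that interact are joined by an edge, and each component is a cluster to balance together. This is my own interpretation, not a technique from the talk. The force names below are the examples from the talk, but which ones interact is made up for illustration.

```python
from collections import defaultdict


def clusters(edges):
    """Group forces into clusters (connected components) of interacting forces."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen, result = set(), []
    for node in graph:
        if node in seen:
            continue
        # Walk everything reachable from this force.
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(graph[current] - group)
        seen |= group
        result.append(group)
    return result


# Hypothetical interactions between the forces mentioned above.
interactions = [
    ("many payment methods", "payments must go through PCI compliant infrastructure"),
    ("teams divided over organisational boundaries",
     "multiple individuals developing at the same time"),
    ("multiple individuals developing at the same time",
     "evolve with the changing requirements"),
]
force_clusters = clusters(interactions)
```

Here the payment forces end up in one cluster and the team and change forces in another, which suggests two design conversations rather than one.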

We are “Armed with solutions looking for problems!”

We resolve conflicts with design decisions.

My key takeaways: modelling forces is important for coming up with optimal solutions. We don’t often do this explicitly as a whole team. It is probably better to do this explicitly with more people. It would be great to sketchnote the problem to make it easier to understand.

Software Contracts Or: How I Learned to Stop Worrying and Love Releasing, Seb Rose

Another way of looking at the test automation pyramid is by removing labels and having most tests testing small parts of the application (at the bottom) and fewer tests testing the whole application (at the top).

[Image: Wordless Test Triangle]

We use test doubles to replace dependencies so that we can test smaller parts of the system extensively (lower down the pyramid). The test doubles must behave the same as the real systems in both success and failure cases. Replacing dependencies can be necessary to generate errors because you may not be able to trigger some error scenarios like server down in the real system.

Contracts are an agreement between the client and supplier. The client expects a benefit but has an obligation to use the supplier correctly. We get the explicit contract documented but don’t always understand the implicit contract.

We can expose this implicit contract via contract tests. A contract test runs against both the real system wrapped by an interface (don’t mock what you don’t own) and the test double which implements that interface. The contract test itself is composed of an abstract class plus two concrete implementations to test both the real system and the test double. Each test in the parent base class runs against both implementations.
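A minimal sketch of that shape using Python's `unittest`. All the class names are hypothetical, and `ThirdPartyClient` is only a stand-in so the example runs; in practice the wrapper would talk to the real system. The abstract base is written as a plain mixin so the test runner does not collect it on its own.

```python
import unittest


class ThirdPartyClient:
    """Stand-in for a client we do not own; it returns None for unknown users."""

    def lookup(self, user_id):
        return {"alice": "alice@example.com"}.get(user_id)


class RealUserDirectory:
    """Our wrapper around the system we do not own."""

    def __init__(self, client):
        self.client = client

    def email_for(self, user_id):
        email = self.client.lookup(user_id)
        if email is None:
            raise KeyError(user_id)  # make the implicit contract explicit
        return email


class FakeUserDirectory:
    """Test double that implements the same interface as the wrapper."""

    def __init__(self, emails):
        self.emails = dict(emails)

    def email_for(self, user_id):
        return self.emails[user_id]  # raises KeyError for unknown users


class UserDirectoryContract:
    """Abstract contract: every test here runs once per concrete subclass."""

    def make_directory(self):
        raise NotImplementedError

    def test_known_user_has_email(self):
        self.assertEqual(self.make_directory().email_for("alice"), "alice@example.com")

    def test_unknown_user_raises_key_error(self):
        with self.assertRaises(KeyError):
            self.make_directory().email_for("nobody")


class RealUserDirectoryContractTest(UserDirectoryContract, unittest.TestCase):
    def make_directory(self):
        return RealUserDirectory(ThirdPartyClient())


class FakeUserDirectoryContractTest(UserDirectoryContract, unittest.TestCase):
    def make_directory(self):
        return FakeUserDirectory({"alice": "alice@example.com"})


def run_contract_tests():
    """Run the contract against both implementations and return the result."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(case)
        for case in (RealUserDirectoryContractTest, FakeUserDirectoryContractTest)
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_contract_tests()
```

If the fake ever drifts from the real wrapper, for example by returning `None` instead of raising, the same test fails for one subclass and passes for the other, which is exactly the evidence we want.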

When promoting microservices you have to test all possible versions of consumer and provider via contract tests. Rolling all these contract tests by hand is very difficult when you have a lot of microservices. Pact creates a contract between consumer and provider and provides a tool to validate each side independently of the other (see Pact - 5 minute guide). Generally the consumer will adapt if a test fails but they could also communicate with the provider to make changes. Consumers publish a JSON contract to the Pact Broker (see Pact - Sharing Pacts with the Pact Broker), which allows providers to fetch these contracts and run them against their latest version. A small tool, can-i-deploy, checks the matrix of consumer/provider versions to make sure they have been verified to work together.
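To make the matrix idea concrete, here is a toy in-memory version of the check. This is not Pact's API or data format; the service names, versions and the `can_i_deploy` function are assumptions that only mirror the idea of looking up verified consumer/provider pairs.

```python
# (consumer, consumer_version, provider, provider_version) -> verification passed?
verification_matrix = {
    ("web", "1.4", "orders", "2.0"): True,
    ("web", "1.5", "orders", "2.0"): True,
    ("web", "1.5", "orders", "2.1"): False,
}


def can_i_deploy(app, version, deployed, matrix):
    """Check `app` at `version` against the versions already in an environment.

    `deployed` maps application name to the version currently deployed.
    """
    candidate = {**deployed, app: version}
    # Every consumer/provider relationship that involves this application.
    relevant = {(c, p) for (c, _, p, _) in matrix if app in (c, p)}
    for consumer, provider in relevant:
        if consumer not in candidate or provider not in candidate:
            continue  # the other side is not in this environment
        key = (consumer, candidate[consumer], provider, candidate[provider])
        if not matrix.get(key, False):  # a missing entry means never verified
            return False
    return True


production = {"web": "1.4", "orders": "2.0"}
web_ok = can_i_deploy("web", "1.5", production, verification_matrix)
orders_ok = can_i_deploy("orders", "2.1", production, verification_matrix)
```

In this made-up environment the new `web` version is safe to deploy, while `orders` 2.1 is blocked because it has never been verified against the `web` version that is currently live.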

My key takeaways: contract testing can make it safer to use test doubles because there is more evidence that the test double matches the real system.
