The First Journey Is Complete

Jul 28th, 2012

We have now reached the end of this journey. See the official announcement with the full list of acknowledgements here. You can check out the guide and the corresponding reference implementation at http://aka.ms/cqrs.

We also welcome your feedback and stories of your own experiences. Check out the section of the guide called Tales from the Trenches, which includes mini case studies of other teams implementing the CQRS and ES patterns in the real world. Some of them validate our findings; others provide different perspectives. Take our project for a spin, explore our findings and those of others, and tell your story! Email us at cqrsjourney at microsoft dot com.

There’s still a lot to explore. We would like to encourage the community to improve and extend our implementation of the Contoso Conference Management System. The GitHub code repo remains open and we’ll continue to review any and all of your pull requests.

Once again, your feedback is very important. Based on it, we may or may not decide to embark on the second journey in the future. For now, we remain enthused!

V3 Final Release

Jul 19th, 2012

This is our third (and final for this project) pseudo-production release of the Contoso Conference Management System. The plan was to focus on hardening and optimizing the system, and to see if we could perform the no-downtime upgrade that we didn’t manage to achieve in the previous update from V1 to V2.

There were no new business features implemented in this release, but we did make significant changes to the code as part of our efforts to harden and optimize the system.

Hardening the system

As part of this release, the team revisited the RegistrationProcessManager to ensure its resilience to a wide range of potential failure conditions. These changes fall into two categories:

Resilience in the face of duplicate events: the system must guarantee idempotent behavior, so that if for any reason it tries to process an individual event more than once, then this does not result in inconsistencies.
Robustness when sending commands: the team implemented a pseudo-transaction to provide guarantees that the system always sends any outstanding commands from the RegistrationProcessManager whenever it saves the process manager’s state. A pseudo-transaction is necessary because it is neither advisable nor possible to enlist the Windows Azure Service Bus and a SQL Database table together in a distributed transaction.
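
The two categories above can be sketched together. This is a minimal illustration in Python rather than the project's own code, and every name in it (ProcessState, InMemoryStore, ProcessManager, the command strings) is hypothetical. It assumes that the process manager's state, the IDs of events it has already handled, and its unsent commands are all saved in a single write, and that commands are removed from the pending list only after the bus accepts them.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessState:
    """Hypothetical persisted state of one process manager instance."""
    processed_event_ids: set = field(default_factory=set)
    pending_commands: list = field(default_factory=list)


class InMemoryStore:
    """Stand-in for the database table: state and pending commands live together."""

    def __init__(self):
        self.state = ProcessState()

    def save(self, state):
        self.state = state


class ProcessManager:
    def __init__(self, store, bus):
        self.store = store
        self.bus = bus  # any callable that sends one command and may raise

    def handle(self, event_id, command):
        state = self.store.state
        if event_id in state.processed_event_ids:
            # Duplicate delivery: never enqueue the command a second time,
            # but do retry anything that is still waiting to be sent.
            self.dispatch_pending()
            return
        # Step 1: record the event and the outgoing command in ONE save.
        state.processed_event_ids.add(event_id)
        state.pending_commands.append(command)
        self.store.save(state)
        # Step 2: send. A crash here leaves the command marked as pending.
        self.dispatch_pending()

    def dispatch_pending(self):
        state = self.store.state
        while state.pending_commands:
            self.bus(state.pending_commands[0])
            # Remove the command only after the send has succeeded.
            state.pending_commands.pop(0)
            self.store.save(state)
```

With this shape a command can still be sent twice (for example, if the process dies after the send but before the final save), so the receiving side has to tolerate duplicates as well.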

No-downtime migration

The team implemented a no-downtime migration strategy to get from V2 to V3 and successfully performed this migration on the live system deployed to Windows Azure. To help with this process, we created an ad-hoc processor that keeps the V2 read-models up to date during the migration process. This was necessary because we replace the V2 processor with the V3 processor before we switch to the new V3 front-end.

Optimizing the system

The team’s optimizations aimed to improve both the performance and the scalability of the system, and they took the majority of the team’s effort in this release. As is often the case with performance tuning, addressing one bottleneck frequently revealed another elsewhere in the system. The optimizations occurred in two places: first, in the UI flow, and second, in the infrastructure.

UI optimizations

The optimizations implemented in the UI flow were designed to streamline the behind-the-scenes processes that take place between the time that a registrant selects seats for a conference and the time the registrant pays for those seats. The changes mean that the user is far less likely to have to wait at any point in this process, while the system still checks that the requested seats are available before accepting payment. It does this in two ways. First, when there are plenty of seats available for a conference, the system assumes that the reservation will succeed and so performs the reservation asynchronously while the registrant is completing other screens in the UI and before the registrant gets to the payment stage. Second, we reorganized the Order entity to perform the totals calculation as one of its first tasks instead of one of its last tasks, making the information available in a read model much sooner than was previously the case.
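
The first of these two changes can be illustrated with a small sketch. It is written in Python for brevity and is not taken from the reference implementation. The threshold value, the function names, and the use of a thread pool are all assumptions made for the example. The point is only the shape of the flow: start the reservation early, let the registrant carry on, and insist on a confirmed result before payment.

```python
from concurrent.futures import ThreadPoolExecutor

PLENTY_THRESHOLD = 50  # hypothetical cut-off for "plenty of seats"

executor = ThreadPoolExecutor(max_workers=2)


def reserve_seats(available, requested):
    """Stand-in for the domain call that reserves seats; True on success."""
    return requested <= available


def start_registration(available, requested):
    """Kick off the reservation; wait for it right away only when seats are scarce."""
    future = executor.submit(reserve_seats, available, requested)
    if available - requested < PLENTY_THRESHOLD:
        # Seats are scarce: block now so the registrant sees the outcome at once.
        future.result()
    return future


def proceed_to_payment(reservation_future):
    """Payment is accepted only once the reservation is confirmed."""
    if not reservation_future.result():
        raise RuntimeError("seats could not be reserved")
    return "payment accepted"
```

In the common case the future has already completed by the time the registrant reaches the payment step, so the final check costs nothing.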

Infrastructure optimizations

These were the most effective optimizations, but took the longest time to identify and implement. The optimizations included:

Handling the majority of I/O operations asynchronously. This is recommended practice, but it makes the code much harder to read, especially when combined with transient fault handling logic.
Processing some commands in-process instead of pushing them through the service bus. In some cases it is possible to handle commands in the same process that sends them, which reduces both the number of messages sent through the service bus and the latency in delivering those commands.
Implementing snapshots in the event sourcing sub-system. This is a common optimization to event sourcing, and we implemented this in memory using the Memento pattern.
Caching read-model data. Again, this is a common optimization, but this raises the question of how frequently the cache is updated.
Tuning the Windows Azure Service Bus by using features such as filters and by implementing parallel publishing, parallel sessions, partitioned topics, and more. For proven perf optimization practices, see this guide.
Using sequential GUIDs to optimize insert behavior into Windows Azure SQL Database tables.
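
To make the snapshot item in the list above more concrete, here is a minimal sketch of in-memory snapshots using a memento. It is written in Python and is only an illustration under stated assumptions: the aggregate, its two event types, and the snapshots dictionary are hypothetical, and the aggregate's version is assumed to equal the number of events applied so far.

```python
class SeatsAvailability:
    """Hypothetical event-sourced aggregate tracking remaining seats."""

    def __init__(self):
        self.remaining = 0
        self.version = 0

    def apply(self, event):
        kind, quantity = event
        if kind == "SeatsAdded":
            self.remaining += quantity
        elif kind == "SeatsReserved":
            self.remaining -= quantity
        self.version += 1

    def save_to_memento(self):
        # The memento captures state without exposing how it is computed.
        return {"remaining": self.remaining, "version": self.version}

    @classmethod
    def from_memento(cls, memento):
        aggregate = cls()
        aggregate.remaining = memento["remaining"]
        aggregate.version = memento["version"]
        return aggregate


snapshots = {}  # in-memory snapshot cache keyed by aggregate id


def load(aggregate_id, events):
    """Rebuild an aggregate, replaying only events newer than its snapshot."""
    memento = snapshots.get(aggregate_id)
    aggregate = SeatsAvailability.from_memento(memento) if memento else SeatsAvailability()
    replayed = 0
    for event in events[aggregate.version:]:
        aggregate.apply(event)
        replayed += 1
    snapshots[aggregate_id] = aggregate.save_to_memento()
    return aggregate, replayed
```

Loading the same aggregate a second time replays only the events appended since the previous load, which is where the saving comes from on long event streams.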

For full details of these optimizations and more, take a look at Chapter 7, “Adding Resilience and Optimizing Performance,” in the journey guide. This chapter also identifies some of the additional optimizations we would like to perform given time.

You can read about the V3 release in detail here, and about all our lessons learned from the project here. The code base corresponding to this release is available under the V3-pseudo-prod tag. You can also see the list of implemented stories.

You can read about the previous V2 release and about the previous V1 release.

And of course, you can access the system with all prior data migrated:

Plans for V3

May 30th, 2012

For our third and final release of the Contoso Conference Management System in our CQRS journey, we’re planning to focus on hardening and optimizing the system. We also intend this to be a no-downtime upgrade - something that we didn’t achieve last time around.

You can see the announcements for the previous two releases here (V1) and here (V2).

Hardening the System

The Orders and Registrations bounded context has contained the RegistrationProcess Process Manager (or what is sometimes referred to as a Saga in the CQRS community) right from our V1 release. This piece of the system is responsible for coordinating the activities of the aggregates in this bounded context and the Payment bounded context by routing commands and events between them. We’ve always been aware that a failure here could have adverse consequences for the system, especially if the aggregates got out of sync with each other, or we ended up with zombie processes. We want to make sure that the system is robust in the face of these specific failure scenarios.

The RegistrationProcess should be able to recover when it:

Crashes or is unable to access storage after receiving an event and before it sends any commands.
Crashes or is unable to access storage after saving its own state but before sending the commands.
Fails to mark an event as processed after it has finished doing all of its work.
Times out because it is expecting a timely response to one of the commands it has sent, but for some reason the recipient of that command fails to respond.
Receives an unexpected event (e.g. PaymentCompleted after the order has expired).
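
The last two scenarios lend themselves to a small sketch. The following Python state machine is purely illustrative: the state names, the command strings, the 15-minute default timeout, and in particular the decision to issue a refund when a payment arrives after expiry are all assumptions for the example, not a description of how the project handles these cases.

```python
from enum import Enum


class State(Enum):
    AWAITING_RESERVATION = 1
    AWAITING_PAYMENT = 2
    COMPLETED = 3
    EXPIRED = 4


class RegistrationProcess:
    """Hypothetical state machine; names and commands are illustrative only."""

    def __init__(self, timeout_seconds=900):
        self.state = State.AWAITING_RESERVATION
        self.timeout_seconds = timeout_seconds
        self.deadline = None
        self.commands = []

    def on_seats_reserved(self, now):
        if self.state is State.AWAITING_RESERVATION:
            self.state = State.AWAITING_PAYMENT
            self.deadline = now + self.timeout_seconds

    def on_payment_completed(self):
        if self.state is State.AWAITING_PAYMENT:
            self.state = State.COMPLETED
            self.commands.append("ConfirmOrder")
        elif self.state is State.EXPIRED:
            # Unexpected event: the order already expired, so compensate
            # instead of silently dropping the payment (assumed policy).
            self.commands.append("RefundPayment")

    def on_tick(self, now):
        """Called periodically so that a missing response cannot stall the process."""
        if self.state is State.AWAITING_PAYMENT and now >= self.deadline:
            self.state = State.EXPIRED
            self.commands.append("CancelSeatReservation")
```

Making every handler check the current state first means that an event arriving in the wrong state is an explicit branch in the code rather than an accident.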

Optimizing the System

The optimization effort will focus on the UI flow. Currently, when a Registrant is ordering and paying for seats at a conference, the UI waits for some processing to complete in the domain (such as checking for seat availability) before displaying the next page to the Registrant. We plan to streamline this process to make the UI much more responsive, while still ensuring that a Registrant doesn’t pay for seats that the system couldn’t reserve (this is a strict domain constraint).

You can track our progress against the V3 milestone.

Feel free to provide feedback on code/guidance and contribute.

V2 Pseudo-Production Release Is Out

May 21st, 2012

This is our second pseudo-production release of the Contoso Conference Management System. The plan was to explore the migration process to understand better how this works in a CQRS/ES implementation and to see if we could perform a no-downtime upgrade with our current infrastructure.

The picture below depicts a portion of our migrated read models (only some of which are actually in SQL Azure).

The following user stories made it into the release:

Providing a better user experience for zero-cost orders (in V1 the registrant was still taken to the payments system even if there was nothing to pay).
Displaying the number of remaining seats of each seat type in the UI (since there was no read model for this in V1).
Applying UX designs to the Conference Creation/Management BC.

Unfortunately, it was not possible this time to perform a no-downtime upgrade. This was due to:

The fact that we weren’t storing all the integration events needed in order to regenerate some read models, nor did we have an API for querying the state of the system from an external Bounded Context. So we now have a message log for every message that goes through the service bus.
Because of this, we have also decided to migrate the event store infrastructure to account for the additional metadata that the message log requires. We will use the message log as the basis for replaying events when regenerating read models, as it is much easier to query and is already sorted by time.

Our migration resulted in ~6 min downtime.

Now that we have the infrastructure in place, a no-downtime upgrade seems to be an attainable goal for our V3 release.

In addition to the behind-the-scenes changes listed in the previous post, the team made the following changes:

As mentioned above, there is now a message log that persists all commands and events. This grew out of the need to persist the integration events. In addition to regenerating read models from events, this is also intended to future-proof the application by capturing everything that might be of interest at some point in the future.
The Conference solution now includes a migration program to migrate the data to the new event store, re-create the integration events that the V1 release did not persist, and rebuild the read-models with additional data.
Several bug fixes (see the backlog for V2 or the commit history)
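
The idea behind the message log can be sketched in a few lines. This Python example is a simplified illustration, not the project's implementation: the MessageLog class, the field names, and the two event names are hypothetical. It assumes that every logged message carries a kind, a name, a serialized payload, and a timestamp, and that a read model can be rebuilt by replaying the logged events in time order.

```python
import json
from datetime import datetime, timezone


class MessageLog:
    """Hypothetical append-only log of every message that crosses the bus."""

    def __init__(self):
        self._entries = []

    def append(self, kind, name, payload, sent_at=None):
        self._entries.append({
            "kind": kind,  # "command" or "event"
            "name": name,
            "payload": json.dumps(payload),
            "sent_at": sent_at or datetime.now(timezone.utc),
        })

    def events(self):
        """Return logged events ordered by the time they were sent."""
        entries = [e for e in self._entries if e["kind"] == "event"]
        return sorted(entries, key=lambda e: e["sent_at"])


def rebuild_remaining_seats(log):
    """Regenerate a 'remaining seats' read model purely from the log."""
    remaining = {}
    for entry in log.events():
        data = json.loads(entry["payload"])
        seat_type = data["seat_type"]
        if entry["name"] == "SeatCreated":
            remaining[seat_type] = data["quantity"]
        elif entry["name"] == "SeatsReserved":
            remaining[seat_type] -= data["quantity"]
    return remaining
```

Because commands are logged alongside events but filtered out during replay, the same log can serve both purposes mentioned above: rebuilding read models now, and answering questions nobody has thought of yet.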

You can access the updated version of the application as a Registrant or as a Business Customer:

We are now starting work on V3 and you can track our progress against the V3 milestone.

You can read about the V2 release in more detail here and about the V1 release here.

Plans for V2

May 15th, 2012

Variety is the very spice of life
- William Cowper, Olney Hymns (1779)

Following on from our V1 release of the Contoso Conference Management System, we are already well on the way toward our second pseudo-production release. This will be our first experience of handling data-versioning and software updates in an application that implements the CQRS pattern and that uses event sourcing. We plan, if possible, to perform a no-downtime migration; if this proves not to be possible, we’ll make whatever changes are necessary to make sure that we can perform the V3 migration with no downtime.

We have three major user stories on our backlog for this release:

Providing a better user experience for zero-cost orders (at present the registrant is still taken to the payments system even if there’s nothing to pay). This will involve changes to our registration workflow process, adding some new events, and removing some events that are no longer needed. The Orders and Registrations bounded context uses event sourcing, so the system will need to work with both the old events (that are still in the event store) and the new events when the new version of the workflow process is introduced.
Displaying the number of remaining seats of each seat type in the UI. This will require a new read-model that will need to be seeded with the correct starting values. This is because in the V1 release the Orders and Registrations bounded context was not receiving and storing the information it needs to calculate these values.
Adding support for discounts for seat prices through the use of promotional codes that the registrant can add during the ordering process. This introduces a brand-new bounded context implemented by a different team (Adam Dymitruk, one of our advisors), that needs to be integrated with the existing bounded contexts. This will also result in further changes to the registration workflow process.
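
One common way to let new code work with old events that are still in the event store is to convert them to the newest shape as they are read (sometimes called upcasting). The sketch below, in Python, shows the general idea only. We have not yet decided on an approach, and the event type, the version field, and the added total field are all hypothetical.

```python
def upcast(event):
    """Convert a stored event to the newest shape before it reaches a handler."""
    version = event.get("version", 1)
    if event["type"] == "OrderPlaced" and version == 1:
        # Hypothetical change: v2 adds an explicit total so that
        # zero-cost orders can skip the payment step.
        event = {**event, "version": 2, "total": None}
    return event


def requires_payment(event):
    """Handler written only against the v2 shape."""
    event = upcast(event)
    # Unknown totals (old events) are treated conservatively as payable.
    return event["total"] is None or event["total"] > 0
```

The stored event is never modified; the conversion produces a new dictionary, so the event store stays an immutable record of what actually happened.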

There are also several changes that will take place behind the scenes, some of which are related to the user stories described above:

Modifying the application to accommodate changes to the set of events used in different versions of the system.
Improving the robustness of the system by ensuring that commands are always idempotent.
Improving the robustness of the system by ensuring that, where required, messages are always delivered in order.
Reevaluating our strategy for persisting integration events.
Evaluating some options for optimizing the command handling within the system.

We plan to have V2 ready shortly to give us time to complete V3 before the end of the project.

You can track our progress against the V2 milestone.

Feel free to provide feedback on code/guidance and contribute.

Specing Out End-to-end Scenarios

May 14th, 2012

From mockups to features

To spec out the scenarios for our conference management system, we’ve started with conversations with the domain experts. We’ve envisioned the user interactions with the system and captured them in mockups, like the one below:

Conference management mockup

Mockups helped us in many ways:

Explaining our ideas in terms of UI interactions to the graphic designers.
Having discussions with the developers in terms of desired functionality.
Establishing, discussing and refining terms from our ubiquitous language.
Evolving our understanding of stories via “what if” scenarios rooted in the mockups.
Becoming a foundation for the future formulation of the acceptance tests.

Based on these we’ve proceeded to develop acceptance tests using SpecFlow. SpecFlow provides an integrated experience for writing and executing tests in the BDD (Behavior-Driven Development) style within Visual Studio. The resulting tests helped us capture, distill and refine the requirements in an unambiguous way. Here’s an example of a SpecFlow acceptance test (Feature):

Conference management Feature sample in SpecFlow

The initial step was to write the project specifications as Feature files using the Gherkin language syntax. They served well for the purpose of evolving and discussing the stories among developers and domain experts. However, in that form, they could not be executed. So, from the testing perspective, they would need to be executed manually if we wanted to verify whether the implemented functionality had, in fact, met our expectations.

From features to executable specs

Later on, we added the underlying plumbing to make sure that those features/acceptance tests can actually execute against our system-under-test.

Executable spec test run

We used two different approaches, both of which are valuable.

Because our application is a web application, for testing directly via the UI, we used WatiN, an open source library for automating Web browser testing. These tests help us gain confidence that the key end-to-end happy paths through the system work. They are fragile, however, and require substantial maintenance effort.

Another approach for sub-scenarios is to use subcutaneous testing (or testing just beneath the UI) by directly calling the controllers (e.g. Features\Registration\All Features end to end\SelfRegistrationEndToEndWithInfrastructure.feature). This way we avoid coupling with the UI, but it may appear to be more complex since it requires some internal knowledge of how the system is implemented. With time, however, it became much easier to use this approach, and we were able to reuse a lot of the infrastructure as well as parts of the test actions (these are called Step Definitions in SpecFlow).
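
To show what "testing just beneath the UI" looks like in practice, here is a small sketch. It is written in Python with plain functions rather than in C# with SpecFlow, so it mirrors the idea and not our actual test code. The controller, its method, and the view names are hypothetical.

```python
class RegistrationController:
    """Hypothetical controller; stands in for the web layer's entry point."""

    def __init__(self, seats_available):
        self.seats_available = seats_available

    def start_registration(self, requested):
        if requested > self.seats_available:
            return {"view": "SeatsUnavailable"}
        self.seats_available -= requested
        return {"view": "RegistrantDetails", "reserved": requested}


# Step definitions: small reusable actions, analogous in spirit to
# SpecFlow Step Definitions but written as plain functions here.
def given_a_conference_with_seats(count):
    return RegistrationController(seats_available=count)


def when_the_registrant_orders(controller, requested):
    return controller.start_registration(requested)


def then_the_view_is(result, expected_view):
    assert result["view"] == expected_view, result


def test_happy_path():
    controller = given_a_conference_with_seats(10)
    result = when_the_registrant_orders(controller, 2)
    then_the_view_is(result, "RegistrantDetails")


def test_sad_path():
    controller = given_a_conference_with_seats(1)
    result = when_the_registrant_orders(controller, 2)
    then_the_view_is(result, "SeatsUnavailable")
```

No browser is involved, so these tests run quickly and do not break when the page layout changes. The trade-off is that they need to know which controller method corresponds to which user action.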

From executable specs to exploratory testing

Of course, having the executable specs in place by no means precludes other types of testing. We recognize that good end-to-end acceptance testing requires breadth, depth, variability of actions and data, and rich scenarios, both plausible and implausible. While we value test automation for some of the scenarios, we also have professional testers on our team who are performing exploratory testing, performance testing, content testing etc. They too use the executable specs in their arsenal of tools and artefacts. The executable specs inform them of the goals of the system and some concrete scenarios.

Finding your way around

The SpecFlow Feature files with the business specifications were structured in the following way. All the end-to-end scenarios were grouped in a subfolder similar to “Features\Registration\All Features end to end” so that we could provide a quick overview of the epic (Registration in this case) and two basic reconnaissance paths through the system: one happy path and one sad (failure) path.

Then, in separate folders we grouped the variations of each scenario within the same epic – for example “Features\Registration\Individual Reservation” exercises logic for scenarios specifically for individual reservations with various seat type selections. This way we can easily separate the full end-to-end acceptance tests from more fine-grained integration tests with various permutations for each underlying scenario.

Regarding the implementation details, there are some general support classes that provide many shared infrastructure functions used in most of the preconditions and assertions throughout the Step Definitions.

As a test runner we used xUnit.net underneath, which is supported by SpecFlow and integrates nicely with some well-known external tools like ReSharper, CodeRush, and TestDriven.NET.

Getting involved

One of the ways that you can get involved with us is to review and refine the existing acceptance tests or contribute new tests for existing scenarios or even planned (and not yet implemented) scenarios.

Final note

SpecFlow is just one tool in our testing armory. It is particularly useful as a way to help capture requirements from our Domain Experts in addition to verifying the behavior of the system.

Announcing V1 Pseudo-Production Release

May 8th, 2012

As part of our CQRS Journey we plan to have several pseudo-production releases of the Contoso Conference Management System so that we can explore the real-world issues of data-versioning and software updates in an application that implements the CQRS pattern in several bounded contexts. Over the next few weeks we plan more pseudo-production releases and as we go through them we will be describing our experiences in the guidance documentation.

Today we are releasing the first of these pseudo-production releases: Contoso Conference Management System (V1).

The current release consists of three bounded contexts:

Conference Management (implemented using a traditional CRUD-style architecture).
Orders and Registrations (implementing the CQRS pattern and using event sourcing).
Payments (implementing the CQRS pattern and using a SQL-based storage).

Keep in mind, while based on the real-world requirements, it is still a sample application. It doesn’t include the following:

Authorization (on the backlog for V2).
Styling for the conference management web site (on the backlog for V2).
Integration with a real third-party payment processor (out of scope for the guidance project; we’ll keep the simulated one).
Conference details (description, speakers, tracks, sessions, etc.) - you will see static data in the registration hub for now.
Perf testing (on the backlog for V3).

If you would like to explore the code in the V1 release, you can download it from GitHub here. To help you get started with building/deploying/running the V1 release code see the release notes here.

The current release uses the Windows Azure Service Bus to provide its messaging infrastructure and can be deployed to Windows Azure. You can access a deployed version of the application as a Registrant and as a Business Customer. Please feel free to use the application and to generate some sample data for us so that we can migrate it for our V2 release.

Note: You can also run a copy of the application locally without using Windows Azure or Windows Azure Service Bus as described in the release notes.

We would appreciate your feedback on both the code and the written guidance. You can submit your reviews either via GitHub (comments, pull requests, issues) or via our lightweight review system.

If you’d like to participate in this sample system evolution and work on any improvements of the existing bounded contexts or new ones, check out our backlog. We’ll be happy to see your pull requests!

Open Call for Enhancements to Event Store Implementation

May 1st, 2012

In the process of converting the Registration process BC to use event sourcing, we realized that we needed some robust event sourcing infrastructure (both for storing the events and for publishing them). We wanted this implementation to be Windows Azure-friendly, as this is where we’ll be deploying our system.

We have now built our initial implementation of the Azure table-based event store and it’s ready for a broad community review and for enhancements. It seems to be self-contained enough and will benefit from crowd-sourcing. Get it from here:

Repo: https://github.com/mspnp/cqrs-journey-code
Infrastructure solution: https://github.com/mspnp/cqrs-journey-code/blob/dev/source/Infrastructure.sln

We really look forward to your participation in getting this event store production ready!

Notes:

This is currently not intended to be a general-purpose, pluggable component. We’ve designed and developed it to fit with our sample application.
There’s another implementation based on SQL Server. It is meant for running the sample app locally and it’s not ready for production. Nor is it in our scope to make it production-ready. So, when prioritizing where to direct your effort, please choose the Azure table-based implementation.

The Road to V1

Apr 12th, 2012

Handling Complexity and Change

DDD and ES reduce complexity and cost of change. They would look like useless magic in stable and predictable projects.
- Rinat Abdullin, CQRS Advisors Mail List

Introduction

One of the topics discussed on the CQRS Advisors Mail List was how to make the CQRS Journey Project more realistic. In the real world, projects face uncertainty, risk, and volatility in requirements. One of the benefits of applying ideas from DDD and using event sourcing is to make that uncertainty, risk, and volatility more manageable.

As part of our CQRS Journey, we are planning a pseudo-production release of V1 of the Contoso Conference Management System that includes the core conference management functionality. We will then explore and document our experiences of enhancing and maintaining the V1 release as we continue to work on the Conference Management System.

Contents of the V1 Release

The following sections outline our planned functionality for the V1 release.

Conference Management

This will be a simple CRUD style bounded context that enables a Business Customer to create and manage conferences. A Business Customer will be able to:

Create and update the conference data such as name, description, and dates.
Publish and un-publish a conference.
Manage the Seat Types (e.g. full-conference, workshop, cocktail party) available at a conference.
View a list of Attendees.

Registrations

This bounded context will implement the CQRS pattern and use event sourcing. The functionality in this bounded context includes:

Enabling a Registrant to create an Order for seats at a conference.
Handling partial fulfillment of Orders.
Enabling a Registrant to locate a previously created order using an Access Code.
Enabling a Registrant to assign Attendees to Seats.
A task-based UI.

Integration

The Conference Management and Registrations bounded contexts will be integrated so that any changes made by the Business Customer to the quota of Seats at a Conference are detected by the Registrations bounded context. Information about the Attendees is also shared between the two bounded contexts.

Payments

Part of the Registration Process includes the Registrant making a payment for the Order. The system will include a fake payment service that simulates handling payments by credit card and by invoice.

Infrastructure

The Contoso Conference Management System will be deployed to Windows Azure and use the Windows Azure Service Bus to provide its messaging infrastructure for Commands and Events. It will also be possible to run the application locally without requiring a Windows Azure subscription.

More Information

You can follow our work in these two GitHub repositories:

You can track our progress against the V1 release milestone:

V1 Release Milestone

Sample Application Overview

Mar 30th, 2012

Contoso Conference Management System

This post summarizes the key features and functionality of the proposed Contoso Conference Management System that we are building in our CQRS Journey project. You can read our motivation for choosing this domain here and here.

Introduction

Contoso plans to build an online conference management system that will enable Contoso’s customers to plan and manage conferences that may be held at a physical location. The system will enable Contoso’s customers to:

Plan the tracks, sessions, and speakers that make up a conference.
Manage the calls for speakers and paper submission process.
Manage registration (the sale of seats) for the conference.
Manage the conference on site, including badge printing and attendee lists.

The following sections describe these functions in more detail.

Overview

The Conference Management System will be a multi-tenant, cloud-hosted application. Business customers will need to register with the system before they can create and manage their conferences.

Conference Planning

When a business customer creates a new conference, he must first define some characteristics of the conference such as:

Whether the paper submission process will require reviewers.
The fee structure for paying Contoso.
The key personnel, such as the Program Chair and the Event Planner.

Building a Program

The business customer must define the conference program. For a simple conference this may be defined directly. For a more complex conference this may involve several steps:

Identifying Tracks and Track Chairs.
Identifying Reviewers.
Making calls for submissions.
Reviewing the papers and assigning speakers to sessions.
Finalizing Track programs.

Selling Seats

The business customer will define how many seats are available for the conference and also for any events that are part of the conference that only limited numbers may attend.

The system must manage the sale of seats to ensure that the conference and sub-events are not oversubscribed. This part of the system will also operate wait-lists so that if cancellations are made, then the seats can be re-allocated.
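
As a rough illustration of this requirement, the following Python sketch enforces a quota and re-allocates cancelled seats from a wait-list. It is a toy model under stated assumptions: the SeatPool class is hypothetical, each registrant asks for a single seat, and the wait-list is served strictly first come, first served.

```python
from collections import deque


class SeatPool:
    """Hypothetical seat allocator with a first-come, first-served wait-list."""

    def __init__(self, quota):
        self.quota = quota
        self.holders = []
        self.wait_list = deque()

    def request(self, registrant):
        if len(self.holders) < self.quota:
            self.holders.append(registrant)
            return "reserved"
        self.wait_list.append(registrant)
        return "wait-listed"

    def cancel(self, registrant):
        self.holders.remove(registrant)
        if self.wait_list:
            # Re-allocate the freed seat to whoever has waited longest.
            self.holders.append(self.wait_list.popleft())
```

The number of holders can never exceed the quota, because a seat is only handed out when one is free or when a cancellation has just released one.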

The system will require that the names of the attendees are associated with the purchased seats so that badges can be printed.

On-site Management

When attendees arrive at the conference they should be registered against an attendee list, given badges, and given the opportunity to purchase additional seats for any extra events they would like to attend. The system that runs on-site at the conference should not assume that it is permanently connected to the main conference management system.

Additional Features

Contoso plans to roll out additional features as and when resources are available. These include:

Support for Event Planners to manage equipment, signage, etc.
Tracking attendees at sessions via a barcode on the badge.
Score submissions for session feedback.
Live feed of session scores to track the best sessions.
Reporting feedback to speakers.
Archiving conference materials.
Publishing proceedings.

Ubiquitous language

Here are the working definitions for the terms in our ubiquitous language for the Conference Management System sample we are building. We will be adding to and modifying this list as the project proceeds…

More info

Follow our progress in these 2 repos (you can set the watchers):