I'm a big fan of delegation and accountability when it comes to governance strategies, and the SharePoint development lifecycle is no exception. If any of you remember, I led a large SharePoint team at a Fortune 500 company for a few years, and I have some battle wounds to prove it. Fortunately for me, the IT department had great senior management who valued environment isolation, which saved us a lot of pain in the end, and I was able to glean some great wisdom from their experiences. However, looking back on those days, I can't help but wish I had done a few things differently. I think more accountability and even more environment isolation would've been a great asset to us, despite what could've been perceived as an inconvenience.
Many companies I see don't even have a development lifecycle governance plan, and those that do typically structure it so that Production servers are locked down, but everything else is fair game. Developers, testers, business analysts, and all their moms are poking around in the various environments, and before you know it, you have a mess to clean up and you're always crossing your fingers when you make a production change. I definitely know from experience that when I was working in a test environment, I would often get frustrated because my web front end configurations were not consistent and I didn't know why, or because something that was working a few days ago was, for some reason, not working now. The root of the problem always goes back to there being just "too many hens in the kitchen". You can waste a lot of time fighting fires in environments, and a little more governance can save you a lot of money when you think of how much that time is costing you.
The figure below is my White Ivory Tower model of how I would structure the deployment process in a medium or large server farm. Much of this article isn't necessarily SharePoint-specific, but I am often surprised at how often people ask me how I would go about this, so it seemed like a relevant "brain dump" for a SharePoint blog nevertheless.
You'll notice the three recommended farms for any one production farm, as well as the different "gates". Later on I will describe the roles of the different gatekeepers and how those individuals fit into the proposed governance model.
Key Features of this Model
- SharePoint Solution Packages are the key to deploying updates and customizations across the farms.
- Each farm has a dedicated gatekeeper who is held accountable for environment quality and cleanliness, and for knowing what is changing, when, and why, and how to back out those changes if something goes wrong.
- Each "gate" requires approval from a designated Change Approval Board (CAB), before changes can be made to the environment.
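To make the promotion path concrete, here is a minimal Python sketch of the model described above. It is only an illustration, not a tool from the original write-up: the class and function names and the package name are hypothetical, and the CAB is simplified down to a single approving gatekeeper per gate.

```python
from dataclasses import dataclass, field

# Promotion order taken from the article; the farm names are labels only.
FARMS = ["Developer VPC", "Development", "QA", "Production"]

# Gatekeeper for the gate in front of each farm (Gates 1, 2, and 3).
GATEKEEPERS = {
    "Development": "Technical Lead",
    "QA": "QA Lead",
    "Production": "IT Operations Administrator",
}


@dataclass
class SolutionPackage:
    name: str                    # hypothetical package name, e.g. "intranet.wsp"
    farm: str = "Developer VPC"  # every package starts on a developer VPC
    history: list = field(default_factory=list)


def promote(package, approved_by):
    """Move a package one farm forward if the right gatekeeper approved it."""
    position = FARMS.index(package.farm)
    if position == len(FARMS) - 1:
        raise ValueError(f"{package.name} is already in Production")
    target = FARMS[position + 1]
    required = GATEKEEPERS[target]
    if approved_by != required:
        raise PermissionError(f"{target} changes need approval from the {required}")
    # Record what moved, where, and who approved it, so a change can be traced.
    package.history.append((package.farm, target, approved_by))
    package.farm = target
    return package
```

A package can only move one farm at a time, and each move is recorded, which mirrors the idea that the gatekeeper always knows what changed and who approved it.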
Key Benefits of this Model
- Environment stability and dependability across all farms is sure to be high because of the required change approval process. If something breaks, you'll know what change broke it, and how to roll it back.
- Quality Assurance and test results are sure to be more dependable because of the stability and consistency in the environments, making for a better product in the end.
- Production roll-outs are sure to be less risky because the implementation steps have been tried and tested.
Environment Roles
DEVELOPER VPCs
Individual development virtual personal computers (VPCs) are changing the role of a development environment. It is no longer the "sandbox" it once was, where every developer has full administrator rights and can actively develop and unit test their code. The major downfall of this sandbox mentality is that developers inevitably step on each other's toes, and the environment is sure to become a mess. It is especially necessary with SharePoint development to provide each developer with their own isolated environment, because SharePoint development involves a lot of hands-on server configuration and deployment. If each developer has their own private and isolated sandbox, they will be much more effective because they won't have to worry about other people messing up their work, and they will always have their own stable place to work. No more development delays because you need to rebuild a server that got too irregular. With VPCs, you can play, prototype, and dig in to your heart's content, because if you ever can't get something working, you can just throw the VPC away and grab a working copy. This is much faster than re-installing Win2k3 and MOSS on physical hardware.
Another key feature with VPCs in this model is the development of SharePoint Solution Packages. Solutions are pivotal to keeping things consistent in SharePoint, and they also make for a great tool for a team of developers. Notice on the left of the diagram that these solutions have their roots at the VPC level, and are promoted all the way up to Production. For more information, see this blog post of mine on how to effectively manage your SharePoint customizations across a large team.
DEVELOPMENT
The development environment should be where all the work of the individual developers comes together before it goes to the test environment. A better name for it might be the "integration" environment.
This should be a clean, stable, and dependable environment. Ever lose a day or two of development because this wasn't the case? As the diagram shows, it doesn't need to be a complicated farm; one server will do. However, a big feature of this farm is what it doesn't have: no developers have access to remote into it, and Visual Studio is not installed on it (except the remote debugger, of course). The team technical lead, lead architect, or developer resource manager should be the traffic cop of this environment and should facilitate the integration of the different efforts of his/her developers. In doing this, the environment itself will be reliable but, just as important, the integration of the software itself will be more dependable and you'll catch more problems yourself, before test or (God forbid) production does. A big aspect of the accountability that is placed on Gate 1 is the quality of the software that is being promoted to test. Keeping the majority of the development team out of this environment will help the gatekeeper immensely in this effort, because they'll know what changed, when, and why.
GATE 1
Gate Keeper: Technical Lead or Lead SharePoint Architect
- Key Point of integration of updates coming from various developers on the team.
- Gatekeeper can better hold developers accountable for well unit-tested code because of tighter control and a closer eye on changes and integration points.
- The gatekeeper is responsible for developing the implementation steps for product launches and updates, and the beginnings of any documentation for updates start at Gate 1.
QUALITY ASSURANCE (TEST)
Obviously no developers should have access to this environment, because of the need to have quality and reliable tests. A frequently changing test environment is sure to produce a poorly tested product because the variables are uncontrollable and ever changing. I will even go so far as to say that the Technical Lead should not have access here either. This is for two reasons. Firstly, a good production roll-out should have properly documented implementation steps as well as back-out steps if something goes wrong. These steps NEED to be tested as well! Good migration steps are just as important as good code! The only way to test these steps is by having somebody other than the author walk through them, and the best place to do that is the test environment. Secondly, SharePoint solution packages make deployments really easy in SharePoint, and you won't need someone very technical to do updates. Your QA staff should be doing a "dry run" of your production roll-out steps. Even if solution packages were not available, the documented steps themselves should be detailed enough for a monkey to walk through them, and if they're not, you shouldn't go to production.
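As a rough illustration of what documented implementation and back-out steps for a solution package might look like, here is a small Python sketch that writes them out as stsadm command lines for someone other than the author to walk through. The function names, package name, and URL are hypothetical. It also assumes the package carries web application scoped resources (otherwise the -url argument does not apply), and your package may need other flags, so check the commands against the stsadm reference before relying on them.

```python
def implementation_steps(wsp_name, web_app_url):
    """Ordered commands to add and deploy a solution package.

    Assumes the .wsp file sits in the current directory, so the same
    name works for both -filename and -name.
    """
    return [
        f"stsadm -o addsolution -filename {wsp_name}",
        # -url only applies when the package has web application resources.
        f"stsadm -o deploysolution -name {wsp_name} -url {web_app_url} -immediate",
        # Run the pending timer jobs now instead of waiting for the schedule.
        "stsadm -o execadmsvcjobs",
    ]


def back_out_steps(wsp_name, web_app_url):
    """Ordered commands to undo the deployment: retract first, then delete."""
    return [
        f"stsadm -o retractsolution -name {wsp_name} -url {web_app_url} -immediate",
        "stsadm -o execadmsvcjobs",
        f"stsadm -o deletesolution -name {wsp_name}",
    ]


if __name__ == "__main__":
    # Hypothetical package and URL, for illustration only.
    for step in implementation_steps("example.wsp", "http://qa-portal"):
        print(step)
```

Keeping the back-out steps next to the implementation steps makes it easier for QA to rehearse both during the dry run.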
Another key note about this environment is that it really needs to be treated like production. Things shouldn't change here haphazardly, and I'd even recommend setting up a Change Approval Board (CAB) for Gate 2 and allowing a maximum of two changes per week. Too many changes per week will result in bad tests, unless you don't have very many test cases, I suppose.
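The two-changes-per-week cap is easy to check mechanically. Here is a minimal sketch, assuming changes are tracked as calendar dates and that a "week" means an ISO week (Monday through Sunday). Both of those are my assumptions for the example, and the function name is made up.

```python
from collections import Counter
from datetime import date

MAX_CHANGES_PER_WEEK = 2  # the cap suggested above for the QA farm


def can_schedule(existing_changes, proposed):
    """Return True if adding `proposed` keeps its ISO week within the cap."""
    # (ISO year, ISO week) identifies the week each change falls in.
    per_week = Counter(d.isocalendar()[:2] for d in existing_changes)
    return per_week[proposed.isocalendar()[:2]] < MAX_CHANGES_PER_WEEK


if __name__ == "__main__":
    # Illustrative dates only: two changes already booked in the same week.
    booked = [date(2024, 1, 1), date(2024, 1, 3)]
    print(can_schedule(booked, date(2024, 1, 5)))  # same week as the booked changes
    print(can_schedule(booked, date(2024, 1, 8)))  # the following week
```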
GATE 2
Gate Keeper: Quality Assurance Lead
- First line of defense against bad production roll-outs, and the first place where a CAB is necessary. The Technical Lead, QA Lead, and typically the project manager should mutually sign off on any updates to the QA environment.
- Typically the QA Lead will perform all the updates that get approved through the gate, not the Technical Lead. This will fool-proof the steps for an eventual production roll-out and the steps themselves will be of higher quality. This will reduce the risk of the eventual production upgrade.
- Rough draft of implementation steps should be provided as a deliverable to the project manager during the CAB approval request, showing additional due diligence and preparation for an eventual production roll-out.
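The Gate 2 checks above can be sketched as a simple pre-flight check. This is a hypothetical illustration rather than part of any real CAB tooling: the role names come from the list above, while the function names and the boolean flag for the draft steps are assumptions made for the example.

```python
# Roles that must sign off before anything is pushed to the QA farm.
REQUIRED_SIGN_OFFS = {"Technical Lead", "QA Lead", "Project Manager"}


def missing_sign_offs(approvals):
    """Return the roles that still need to sign off, sorted for stable output."""
    return sorted(REQUIRED_SIGN_OFFS - set(approvals))


def gate_2_open(approvals, has_draft_steps):
    """Gate 2 opens only with every sign-off and a rough draft of the steps."""
    return not missing_sign_offs(approvals) and has_draft_steps


if __name__ == "__main__":
    approvals = ["Technical Lead", "QA Lead"]
    print(missing_sign_offs(approvals))  # roles still outstanding
    print(gate_2_open(approvals, has_draft_steps=True))
```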
PRODUCTION
This is the environment that most companies get right. Nobody has access to this except for individuals in central IT operations, and changes to this environment are never haphazard. A good IT department will usually have a department-wide production change approval process to get through, and changes will rarely happen more than twice per month, unless it's an emergency.
GATE 3
Gate Keeper: IT Operations Administrator
- Department-wide CAB approval required.
- Final draft of implementation steps required.
- Steps implemented by someone in central IT operations, not someone on the project team.
CONTENT STAGING
A lot of things in SharePoint don't require a strict approval process to make production changes. The point of SharePoint is easy and fast content collaboration, and the aforementioned gates are really meant for custom, home-grown solutions, not content. A content staging environment is often desirable if you have a large company and high-risk content pages with a large number of people collaborating on that content. For instance, you can bet the home page of a billion dollar company needs an approval workflow before it can be changed. This can be done out of the box with the MOSS publishing templates, but oftentimes corporations want to further isolate the content approval to a whole separate farm, where the risk of "leaks" or unwanted changes is lower. In that case, you'll typically have another gatekeeper who approves content, and that content is moved by pre-scheduled content deployment jobs set up in Central Administration. This is an entirely optional environment, and its usefulness depends largely on risk and the number of individuals managing content.
Additionally, I'm a firm believer that if a company is doing some internal SharePoint training for its end users, the training sites don't belong in either Production or QA. In fact, I would sooner hold training in Production than in QA, because of my strong feelings toward keeping that environment isolated. However, this content staging environment is the perfect place to spin up training or other "sandbox" sites.
SQL CLUSTERS
What can I say? Anybody with enough of a budget to build four farms with ten servers is surely going to be able to afford to cluster their backend, right? The benefits of having redundant hardware go without saying…
Conclusions
If you made it this far and you actually read everything I wrote, I'm thrilled, because honestly this write-up is admittedly a "white ivory tower" model. It is a model that makes it tempting to take shortcuts. Setting up those three gates will take a lot of perseverance and determined leadership, and for many organizations it will fly in the face of company culture. However, I am convinced that with the right individuals in those key leadership roles, this governance model will save a lot of money in the long run by cutting down on wasted time.
Anybody have a different model or any suggestions that have worked for them?
Phil