
I’m attending solo, and am mercifully spared the logistical mischief of traveling with spouse or children.  Here are my best insights for preparation, in no special order.

-         Minimal clothes.  After all, hats and t-shirts will be cast my way all week long.  Don’t bother unless you’re planning on a job interview, or unless you have some awesome shirt like Ghost in the Shell or a Tech-Ed conference shirt from the ancient past, like 2002.

-         Reading material.  I always travel with a few books, for long flights, long waits, or early hangovers.  This year, Classical Chinese, Hastings’ Retribution, and Davies’ The Eerie Silence accompany me.

-         Airline food.  It’s not memorable.  No, really!  Even the pre-con lunches are better!  Pack a couple of sandwiches for the flight.

-         Power strip.  Bring one and make friends.

-         Your feet.  Take care of them.  It’s too late now to start getting in shape for the long hikes of Tech-Ed, so make sure you have your most comfortable shoes.  Not only that, get some moleskin, and pre-emptively apply it to protect your toes.

-         Get up early on Sunday (if you’re in town), and get over to the center as early as you can to register.  You avoid the madding crowds, and have a better chance of getting the style of backpack you want (if you are one to care about such things).  For non-pre-con attendees, your chance starts at 10:00 on Sunday.  Don’t dawdle; others won’t.

-         Don’t stress about taking notes at sessions.  You can get the entire PowerPoint deck and download the video of the session within a day or so, which liberates you to spend your time listening and thinking, rather than trying to seize every syllable for long-term archiving.  This is an awesome feature.

-         Sessions and labs and vendors. Since you can download all the sessions, being there is not so crucial as we intuitively think.  I’m considering spending more time in the proctored labs and chatting with vendors.  That’s something I can’t just go back and get from Tech-Ed Online.    Yet, to be honest, it’s difficult to compare the pleasure of a Minasi session with a follow-the-numbers lab.   Perhaps the greatest resource at Tech-Ed is the strong geek contingent from Microsoft.  Some are mere managers, but here you have the chance to talk to the people who build the apps and OSs and development environments we all know and love.  I remember talking with the guy who wrote the code to allow us to pull AD objects from the recycle bin.  Extremely rewarding, and quite unique to this conference.

-         Think about those left behind.  Yeah, you lucked out and got support to attend.  But not everyone did.  Keep them in mind: solicit technical questions from them and make sure to get answers.  Bring back swag and t-shirts.  Write back some emails telling people how the conference is going.  Send notes about particularly good sessions to people who will appreciate them.

-         Keynotes.  Hey, these are streamed!  You can hang out or stay home and listen in!  On the other hand, if you’re turned on by techy power, and like close quarters, then the teeming acres of geeks at the keynote will welcome you.  I’ll be out in the hallway, with all the space in the world, notebook plugged in, alternating between Microsoft glitterati and YouTube.

I’ll be leaving on Sunday for the 2011 SharePoint Conference in Anaheim, Calif.  I’m especially interested in the chance to review and learn new insights from the masters about infrastructural aspects of SharePoint.  In addition, I’m planning on attending some of the show-and-tell sessions to learn about what other businesses have been doing with SharePoint, since the business stories can be strategically valuable, and are often much less predictable.    With technical conferences, the preparations are standard for most people, and best practices are regularly circulated.  Nonetheless, I have a few to share below, few of which I have come across elsewhere.  Maybe there will be something useful for you as well.

Things to bring (and do):

  1. Moleskin.  Bring it and use it; this is the best prophylactic against blisters, which are all too common when you walk a couple miles a day back and forth between sessions.  A must.
  2. Business cards.  Bring them and trade them.
  3. Bring a power strip.  Outlets are invariably at a premium in the conference rooms, and you will make many friends if you have one of these.  Until you leave and take it with you, that is.
  4. Vendors and birds-of-a-feather sessions.  Since all breakout sessions are recorded and available on the website soon after taking place, they should not be the exclusive focus of your planning.  Consider the non-recorded events as well, such as birds-of-a-feather sessions.  In addition, there are some great vendors in the SharePoint space; it is worth spending some time getting to know some of them and their products.  Make a point of collecting business cards.
  5. Labs. Usually I ignore these, but since I can download all the breakouts, I plan to give some of these a try.
  6. Loot. Maybe, like me, you have enough t-shirts to last a lifetime.  But your co-workers likely do not.  Conference swag is worth getting not just for yourself, but for the folks back home.  At least half of the swag I bring back will be distributed.
  7. Download. The sessions are available for three months or so.  After that, it’s either nothing or your favorite file-sharing service.  So make sure to download as much as you can, as soon as you can.

While on the subject of SPC11, I’d like to make an observation on its website (http://www.mssharepointconference.com).  The site has no forums or newsfeeds; the only novelty and populist participation comes from a trickle of tweets and bleats, most of which are simple bursts of zeal.  How unlike the Tech-Ed site, which now offers year-round availability.  Rather than re-invent the wheel, it would make sense for SPC to leverage the clear success of the Tech-Ed site.

One unique, innovative aspect of Tech-Ed this year was the Sunday night sessions on professional development.  The results were  uneven but intriguing, and augur well for the future.

They began half an hour or so after the standard pre-cons concluded (not enough time to get dinner, but enough time to nosh on the cocktail-party food thoughtfully provided).  There were two parts:  a general session for IT pros and devs alike, itself split between two speakers, and separate sessions for the two groups.  Alas, I missed the dev meeting.

The general session began with Steven Frost, who speaks well, and clearly thrives in front of an audience.   His first subject was cautionary in nature: be careful what you do online, because it will be searched out by potential employers.  Some examples he gave were  revolting, but reflect the tenor of our times.  Even for IT pros, much of this was eye-opening.    After this shocker, he went on to  dispense more general, mundane career advice, and much of this was a series of bromides like something I would have expected from Dale Carnegie or Poor Richard’s Almanac:  “be on time”, “know your strengths and weaknesses”, and “expect that things will go well” are a few examples.

The guiding principle here is that you are a brand, and you must adjust your behavior on- and off-line to maximize the recognition and  value of that brand.   This is standard business thinking, really, but seldom discussed or embraced in IT, so Frost’s exploration gives it value for many people.  Naturally, the people who would have benefited most from it were not there.

The second speaker was Zeus (really!) Kerravala, who explored workplace IT trends, and their implications for the IT workforce.    This covered standard insights, such as the growing prevalence of smart devices, mobility, and consumer devices used for business purposes.  Nothing revolutionary here.  He did, however, also explore VDI as an emerging game-changer which would permit workers to use their own devices without compromising IT needs.    This naturally has great potential for cost reduction, but the risks of data compromise and illegitimate access are real.  Moreover, this flies in the face of control-freak organizations and admins which like to control items as trivial and business-irrelevant as personal screensavers and wallpaper.    Nonetheless, Kerravala argues these changes are here to stay, and he foresees a very near-term future when IT success will derive from its ability to deliver satisfactory user experience with social media, desktop virtualization, collaboration, mobile applications, and appropriate levels of security and management.    There is much food for thought here, especially as we consider future directions for our own careers.

Personally, I’d say Frost was vastly the better presenter with more to say about career tactics, but Kerravala had more strategic career insight.

After this, IT pros and devs went their separate ways; I stayed with the former for a very loose panel discussion of career moves and  past choices.  The panel members were interesting, yet I kept hearing very standard, predictable advice, with very little that moved or surprised or  illuminated me.  In addition, since this was a panel discussion, no one had needed much advance prep, and this wing-it approach was very evident  throughout.  In the end it seemed to me that any half-dozen forty-something IT pros randomly recruited from the audience could have said much the same things.  For next year, I’d like something more substantive for this second half.

In retrospect, I was surprised that no one had made real suggestions for further study, such as Fortune, Harvard Business Review, soft skills classes, David D’Alessandro’s excellent Career Warfare (I’ll be writing about this soon), or even something as simple as industry trade publications.

The professional development sessions were worthwhile, especially since aggressive, risk-taking, strategically-oriented behavior is not the first notion generally associated with folks in our line of work.  In principle such sessions are very valuable.  Next year’s could build on this great start by highlighting different skills, bringing in different speakers, pointing to more offline resources, and adopting a more structured approach.

Just got back from watching David Chappell’s COS202, Introducing the Windows Azure Platform.  His presentation delivers.  In 75 minutes, he gives a quick, comprehensive, strategic overview of the major elements and motivations in the Azure platform.  He clearly shows how the major elements are envisioned, and the new opportunities inherent in this.  In addition, he also spent time outlining the varieties of applications/systems which would thrive in a cloud environment, and also explored the business considerations.  If anything, I would like to have seen more on this, since many IT pros are oblivious to it.  Chappell himself is an exceptional presenter, and spoke with confidence and enthusiasm throughout.  This is a time of change and a time of opportunity, and as he pointed out, a great time to be alive.

Definitely download and watch.  It will not help you study for the inevitable Azure Cert exam, but will help you understand the dimensions and the potentials of this major, disruptive new technology.

Tech-Ed 2008: Deploying MOSS with End-to-End Kerberos Authentication, Matt Munslow

In this 2008 Tech-Ed Europe presentation, Munslow does a fair job of explaining material that occasionally seemed complicated even to him.  Not brilliant, mind you, but fair.  He needs to work on his posture; he tended to rock right and left like a metronome.  And the video reproduction is lamentable; the fuzzy quality of the close-ups makes the demos nearly unusable.  Nonetheless, there are some good insights about Kerberos and troubleshooting that do much to make this worth the effort.  This 400-level presentation gets very serious very fast.  Note that it focused on SharePoint 2007.  It retains value even in an SP2010 world due to its methodology and Kerberos explanations.

Munslow focuses on three logical blocks of material:  Concepts, Configuration, and Troubleshooting.  

First, Concepts.  How does Kerberos work, and how does this affect our MOSS implementations?  Munslow focuses on mechanisms within Kerberos which are germane to implementing SharePoint 2007.

1.  Service Principal Names.  Everything in a Kerberos realm (or AD domain) has a unique service principal name, and getting our system to work with it relies completely on correct name configuration.  Clients use these service principal names to identify and authenticate with services like SharePoint.  He gave an example of what these look like:

HOST/server (“server” means the NetBIOS name for the server) and HTTP/www.contoso.com

“HOST” is one class of service principal name.  So is HTTP. 

2.  Keys.  Used to encrypt data; these are normally derived from hashed passwords.  There are two types: the long-term key and the session key.  All domain objects have a long-term master key for authentication. 

3.  Tickets.  Data encrypted with keys, used by clients to authenticate against services.

4.  Key Distribution Center.  This role is handled in AD by the Key Distribution Center (KDC).  It comprises a database of objects and their related keys; an Authentication Service, which initiates communication between client and AD and provides the ticket-granting tickets clients use to engage the third component; and the Ticket Granting Service, which is the source of the tickets used to authenticate against SharePoint.   

5.  Sub-Protocols.  There are three exchanges related to this.  First is the Authentication Service Exchange, used by clients to get a ticket-granting ticket.  Then there is the Ticket-Granting Exchange, in which a service ticket is issued to the client.  Finally we have the Client/Service Exchange, in which the client authenticates against SharePoint.    Each represents an opportunity for things to go awry, and Munslow wisely recommends understanding them as an aid to troubleshooting. 

Next, let’s review the authentication process and how it works.  There are four steps.  First, a client negotiates with an IIS server running SharePoint.  An anonymous request is denied, but the denial includes information on how the client can negotiate access.  Second, the client requests a ticket-granting ticket from an AD domain controller.  This request relies on recognized client ID information and a “nonce” (number used once) value.  AD replies with a ticket-granting ticket, encrypted with the KDC’s own master key so that it can be validated later, plus a session key the client will use for further requests; this part of the reply is encrypted with the client’s own master key, so only the client can read it.  The nonce is also included, which verifies to the client that the reply came from the server it originally contacted.  Now the client has a ticket-granting ticket and a session key.  In Step Three, the client relies on these to ask the ticket-granting service for a ticket for access to SharePoint.  The reply contains a service ticket encrypted with a key specific to the SharePoint service (and which will work, of course, only on the target server), along with a new session key for talking to that service.  Note that AD does not keep a copy of the session key; it creates it and sends it, but does not retain it, which is why the ticket-granting ticket is crucial.  Now the client has what it needs to access the SharePoint server. 

Now, in Step Four, the client uses these credentials to contact the SharePoint server.  It sends the authenticator information again, with the service ticket provided by AD.  The service on the SharePoint server uses its own session key to unlock the ticket from the client, discovering all is proper.  Then it opens a session with the client. 
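
If you want to watch this machinery at work on a real client, the ticket cache is easy to inspect from the command line.  This is just an illustrative aside: klist is built into newer versions of Windows (on older boxes it was a resource kit tool), and purging the cache lets you watch fresh tickets being requested.

rem List the ticket-granting ticket and service tickets currently cached for the logged-on user
klist

rem Clear the cache to force a fresh round of AS and TGS exchanges
klist purge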

This is the theory and the concepts behind Kerberos with SharePoint; they should be familiar to anyone who has studied PKI.  Next, Munslow explains the Configuration process.    

Second, Configuration.

Munslow presents a fairly generic customer topology: 3 load-balanced WFEs (Web Front Ends), 3 application servers (one index and two query), and a backend cluster (2 active and 1 passive).  He sought to reproduce this in VMware, kerberizing it along the way.  First they did the web applications.    Kerberizing them required creating a domain service account, and using this account to create the web application; this is the account the application uses to represent itself.    Next, DNS entries were created, but not with CNAMEs.  Instead, it was necessary to rely on A records, which Munslow pointed out is non-intuitive.  In Step Three, they created SPNs (Service Principal Names).  They utilized a tool from the Windows 2003 Server Resource Kit, called setspn.exe.    This creates SPNs for the NetBIOS name of the service, and for the FQDN.     Fourth, for the web application, you configure it in SharePoint Central Administration to use Kerberos. 
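
To make the SPN step concrete, here is a minimal sketch; the account and host names are hypothetical, and the names you register must match exactly how users will address the site.

rem Register SPNs for the web application's service account, for both the NetBIOS name and the FQDN (hypothetical names)
setspn -A HTTP/portal CONTOSO\svcMossWeb
setspn -A HTTP/portal.contoso.com CONTOSO\svcMossWeb

rem Confirm what ended up registered against the account
setspn -L CONTOSO\svcMossWeb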

Now, this was all in reference to a web app.  There can be complications here, since during installation a non-default port is used, or a random one is assigned.  Also, issues can arise from the fact that IE 6 (used on Windows XP) does not properly construct the SPN: the non-default port would be left off.  Microsoft has a fix for IE that corrects this through a registry change.  Once done, things started to work.  Note that the use of Windows Server 2003 or 2008 does not affect this, since it is, after all, a client issue. 

A demo followed.   Unfortunately the screen was too poorly recorded for this to be usable at all.  Even maximizing the screen was useless for rendering this intelligible. In light of the fact that several cameras were used to record this session, it’s quite surprising that the end result was such a disappointment.   Generally, though, working within the Central Administration to enable Kerberos was straightforward.  As straightforward as a topic like this can be, anyway.

Shared Services. 

After the web apps have been kerberized as shown above, the next stage is to kerberize the shared services provider infrastructure, which is done by this clear but cumbersome stsadm command:

stsadm.exe -o setsharedwebserviceauthn -negotiate

This does for the shared services provider what we did for the web apps.  However, Munslow discovered that when this was done, search settings would break, with seemingly random error messages. 

Issue #1.  Munslow describes the torturous process of troubleshooting this completely undocumented problem.  He found enlightenment in the shared services provider infrastructure, specifically in the Office Server Web Services web application, which is created when the farm is created.  It runs in an application pool (the Office Server Application Pool) with the identity of Network Service.  Note that this service runs on non-default ports: 56737 and (secure) 56738.   Every time a shared services provider is created, it creates a virtual directory in that web application with its own application pool.  This runs as the identity of the account used to create the SSP, that is, the SSP service account.  The other important thing to note here is that it is the .NET client which uses the web application to communicate with other SharePoint servers.  This is how inter-server communication between SharePoint servers takes place. 

The thing is, the .NET client cannot use non-default ports to bind to the server. 

Issue #2.  Another problem occurs when you have more than one Shared Services Provider.  This is common when using Excel Services.  Here the difficulty occurs when kerberizing the services: the system’s first impulse is to use the same name (albeit with a different user) for each, which is not allowed and will lead to errors, since the same SPN ends up registered more than once. 

Issue #3.  Yet another, and more widely encountered, issue pertains to the indexer, which cannot crawl Kerberos web applications on non-default ports.  To overcome this, an NTLM web application had to be created and extended as a Kerberos one.  Thus the crawler would use the NTLM version, and users would use the Kerberos version.  Munslow’s group found that a crawler could crawl a Kerberos application as long as it was on a default port, such as 80 or 443.  Otherwise it would not work.  His work-around was to avoid using any non-default ports for any crawled web app.  No more elegant solution currently exists. 

For Munslow’s group, the interim solution was to discontinue the use of Kerberos.  This was disappointing but practical.    More than half a year after Munslow left this effort, a new SPN format was introduced which overcame this: the MSSP format.

The MSSP requires the creation of two SPNs for every server in the farm which is participating in shared services, which means one for the non-secure port 56737 and one for the secure port 56738.  This allows you to name the shared services provider; thus solving many of the issues Munslow mentioned earlier, since it gets rid of the problem of duplicates.    How to make this work?  First, the infrastructure update must be run on all SharePoint servers.  Then a manual registry tweak must be run, adding a DWORD entry, set equal to 1, in this key:

HKLM\Software\Microsoft\OfficeServer\12.0\KerberosSpnFormat

After this, do a reboot. 

Then register two SPNs for every server involved with shared services, as noted. 

Finally, go to the command prompt, and run

stsadm.exe -o setsharedwebserviceauthn -negotiate

That should do the trick.  A demo followed; it was partially intelligible.  
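
Pulling those steps together, the sequence looks roughly like the following.  Treat it as a sketch, not gospel: the server names, SSP name, and service account are hypothetical, and my reading of the registry instruction (KerberosSpnFormat as a DWORD value under the 12.0 key) should be verified against current Microsoft guidance before you rely on it.

rem Enable the MSSP SPN format on each SharePoint server, then reboot
reg add "HKLM\Software\Microsoft\OfficeServer\12.0" /v KerberosSpnFormat /t REG_DWORD /d 1

rem Register the two MSSP SPNs for every server participating in shared services (hypothetical names)
setspn -A MSSP/APP1:56737/SharedServices1 CONTOSO\svcSSP
setspn -A MSSP/APP1:56738/SharedServices1 CONTOSO\svcSSP
setspn -A MSSP/APP2:56737/SharedServices1 CONTOSO\svcSSP
setspn -A MSSP/APP2:56738/SharedServices1 CONTOSO\svcSSP

rem Switch the shared web services over to Negotiate (Kerberos)
stsadm.exe -o setsharedwebserviceauthn -negotiate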

Next, Delegation

To begin with, note that this is only needed if SharePoint is to pass credentials along to another service.  It is not necessary for simply logging in to machines.  For services not hosted by the SharePoint server, such as RSS, we are confronted with the so-called “single-hop rule”, which is meant to stop man-in-the-middle attacks.  The goal here is to allow SharePoint to represent a client in approaching these other services.  This can be done as follows:

 1.  Register the appropriate SPNs for the service.

2.  The server which does the impersonation (or, more charitably, representation) must be trusted for delegation.  This can be done (easily) in Active Directory Users and Computers through a constrained delegation.  Then, also within ADUC, add the service account of the RSS server. 

3.  The impersonating service also must be trusted for delegation.  This account too is trusted for constrained delegation with the RSS account.

With this done, a user logged in to a SharePoint site will be able to use an RSS web part on a SharePoint page. 

And that’s how delegation works.

Third, Troubleshooting. 

For me, this is always one of the best and most rewarding portions of any presentation, and Munslow culled through his own customer experience to assemble these observations for responding to Kerberos-related problems.

  1. Duplicate SPNs.  This is the first place to look.  As noted above, this can be a source of great mischief (a quick command-line check is sketched after this list). 
  2. CNAMES in DNS.   The only workaround is using A records.
  3. Use NETMON on the wire between the client and AD, and between the client and the service, and look for errors.  Also check the performance of the sub-protocols. 
  4. Check sub-protocols.  See above.
  5. Use Event Viewer.  If you do not see Kerberos 540 events in Event Viewer, then there’s a problem. 
  6. Turn on Kerberos Logging for more detailed logging.
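
For items 1, 5, and 6, the command-line checks are quick.  A rough sketch follows; setspn -X requires the newer setspn that ships with Windows Server 2008, and klist is built into newer Windows (a resource kit tool on older boxes).

rem Search the domain for duplicate SPNs
setspn -X

rem Inspect the Kerberos tickets cached on a client or server
klist

rem Turn on verbose Kerberos event logging, then watch the System log for KRB errors
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters" /v LogLevel /t REG_DWORD /d 1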

Resources:

He also suggested some resources; perhaps the best of these were:

www.microsoft.com/Kerberos for general information on the installation process for Kerberos authentication.  Very thorough and insightful.

technet2.microsoft.com/windowsserver2008/en/library/0c591182-b5ec-4686-9209-10047b454dfa1033.mspx

Other resources were mainly MIT’s take on Kerberos and some RFCs.

So, what’s the value in his presentation?  Weakly recorded demos, weak body language in presentation, and some daunting material.  Not only that, the world is moving to SharePoint 2010.  True on all counts, yet despite all this, Munslow’s presentation merits some attention for two reasons:

1.  The whole world has not switched to SharePoint 2010 yet, and SharePoint 2007 will remain in use for many years.  Thus its Kerberos performance needs to be well-understood.

2.  The advice on troubleshooting is very good, and generally applicable for all versions of SharePoint.

When my former blogging platform inexplicably disabled line-skipping between paragraphs, it was time to go.  I moved to WordPress, so I greeted Jeff Siarto’s  book with excitement, because WordPress is powerful but occasionally non-intuitive.  He did not disappoint.

Head First WordPress covers getting started with your blog and goes beyond the basics of adding entries, such as managing content and managing pages.   Siarto also covers cool things such as customizing your blog with video and audio, and adding contributors.  The book is ambitious, and early chapters assume you use a web server and a hosting service.  However, even if you are but a simple blogger, there are good tips and help getting started, such as the adoption of themes and widgets.  I personally found some of the examples, such as adding textboxes, quite helpful, and have adopted them in my own blog, affinitymask.wordpress.com.

I liked the presentation style.  Siarto offers a “multisensory learning experience”.  This means teaching strategies such as creative redundancy, a conversational prose style, anecdotes, and humor, to teach through entertainment.  Not easy with most subjects, but it succeeds here.   Since I have little interest in learning HTML or PHP, the many pointers and notes were ideal.  Ready-to-use CSS code is also included, a godsend to those with no interest in learning it just to blog. 

Good for high-school and college web publishing classes, and for anyone wanting to learn independently.     Oriented more towards ambitious web publishing efforts than garden-variety blogs.  Available through O’Reilly, at http://oreilly.com.

Normally, we don’t look to HR folks for insights about SharePoint.  And I’ll admit to feeling some inner resistance initially, and only went to this session because I was geeked out on earlier ones.  However, I was surprised that I discovered great value in this presentation from the 2009 SharePoint conference in Las Vegas:

  •  business motivations for technology change
  • good thoughts on implementing SharePoint
  • social changes impacting our corporate technological evolution

 

1.  The Story

Patricia Romeo explained the background: Deloitte HR had noticed many millennials leaving after less than a year, and discovered that younger people had great difficulty connecting with others, and never felt a sense of kinship around them.  This is especially true when compared with the vast (if shallow) networks they developed on Facebook and similar sites.  As a result, Romeo started an effort to develop an internal equivalent which would permit people to connect in the course of their work, make a large organization feel smaller, and overcome deep silos. 

Her demonstration of the self-branding site was a typical, very straightforward SharePoint site, with business card information on the left (along with people she knows), “About Me” and interests/favorites in the middle, and resume/documents/publications in the right column.    The most curious elements to me were the many personalization features, such as interests, sports teams, and even things to do in her city when visiting her office.    This reflects the millennial predisposition for mixing the professional and the personal in life.    A blog is also integrated into each D Street profile, featuring both professional and personal topics.  The end result has been 24,000 active profiles, 2,000 blogs, and hundreds of visitors a day, and an application which is rivaled only by T&E. 

 

2.  The Technology

Rob Foster then came on explaining more of the infrastructure aspects of D Street.  In particular, he outlined the feeds from SAP that pre-populate “business card” components, and ways for users to incorporate and control feedback and guestbook inputs, delegations, etc.  Even Deloitte partners got involved in creating and editing their own profiles. 

The system was built to accommodate 50,000 users, <3 second load times, consistent look and feel across profiles, and ways to monitor/moderate use and misuse.  Their legal and risk departments were surprised to find that few issues emerged from the D Street implementation; it seems that the maturity and professionalism of their work force pleasantly surprised them.

Scalability.  Foster found SharePoint scaled very well for D Street: two WFEs were enough for 50,000 users using blogs, personal sites, and collaboration sites.  Similarly, for the Deloitte user base they found that a single indexing server and a single search/query server were enough.  One major (and surprising) weakness here was Foster’s failure to include specs on the machines used for this: RAM, proc type and count, etc.  Points off for this.  One customization in their architecture reflected the large amounts of information users were posting: this necessitated multiple content databases.    Foster gave a logical application view of the architecture, but the material was too difficult to read (even with a freeze-framable video!) and too little explained to make much sense.  This is very regrettable, since the material seemed very promising.  In fact, for the techs in the audience (probably 90%, conservatively), this was some of the most fascinating content.  It deserved more. 

The deployment was incremental, in groups of almost 10,000 users.  The standard version of SharePoint 2007 was used, and all profiles were provisioned within seven days (!). 

Mercury tools (LoadRunner) were used for load-testing, and initially showed very high response times; this was addressed through a caching layer over the top of the web parts.   Keeping environments in sync for the load-testing was also a challenge.   

Lessons Learned.  Foster spoke to lessons learned from all this, post-rollout.  One, blogs should only be generated at user request, to avoid “dead air” in SharePoint and improve loading times.  Doing this reduced their indexing time from weeks to about three days.  Two, the site creation process is key to performance, stability, and maintenance.  Foster’s process was to create sites when accessed (i.e., on demand).    Also, the process no longer relies on site templates, leading to improvements both in site creation and subsequent performance.  Finally, they needed to improve the search to handle wild-card searches for people.    

Future Plans.  In the future, they plan to integrate D Street with the Deloitte internal portal.  Dashboards will allow them to personalize this for each employee; widgets can easily be added at will.    Social networking features have also been added to make it easier to make connections throughout the organization.   They also discussed developing community areas, to create more ways for employees to network, and even to achieve business purposes on occasion. 

Audience questions started with the toughest: How does this add to productivity?  Romeo took this one and said that it was comparable to email.  No one can imagine not having email now, but what numbers could be provided to prove this?   

Conclusion:  Overall, this seemed like a fairly generic implementation of SharePoint, and it’s not intuitive why this was rated a 300-level session.  But that’s not a slam; the presentation was noteworthy for its exploration of business imperatives at work in technology design and implementation, and especially for the way that larger social trends inform business technological change.

Even though this Tech-Ed 2008 presentation by Mike Watson, a Microsoft Technology Architect, dates back a few years, there are many concepts here which will remain valuable far into the future.  Listening to this, it seemed to me that much of the HA/DR information here is fairly generic, with relatively little which was completely specific to SharePoint.  Nonetheless, Watson shows how these are excellent principles, and worth mastering, since they will help not only with SharePoint but with SQL and other high-criticality systems as well.

1.      Measuring Availability:  “It’s all about the Nines” 

Mike explains what the various nines mean.  So, for example, 99% uptime means 87 hours of downtime permissible a year.  99.9% cuts this to just nine hours a year.  Not bad, most of us would agree.  But then, going to four nines, we see less than one hour allowed per year.  The famous fifth nine, 99.999% uptime, means that less than five minutes of downtime a year is allowed. 
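
The arithmetic behind those figures is simply the allowed downtime fraction times the 8,760 hours in a year: 0.01 × 8,760 = 87.6 hours for two nines, 0.001 × 8,760 = 8.76 hours for three, 0.0001 × 8,760 ≈ 53 minutes for four, and 0.00001 × 8,760 ≈ 5.3 minutes for five.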

In practical terms, three nines is about all we can realistically achieve.   Here’s why.  Just five minutes in overages for maintenance can break five nines.  And a SQL cluster failover takes up between 5 and 20 minutes.  For patches, there may be four hours a month.  So, obsessing over five nines seems scarcely realistic and an unproductive diversion of resources.  Watson’s first task is to establish realistic expectations.

 2.     So, what are the most likely outages you will experience?

In Microsoft’s experience, 10% of downtime results from botched maintenance.  Some 20% came from web server capacity issues.  Another 30% was due to poorly maintained storage affecting SQL.    Hardware failures account for 20%, and everything else under the sun represents 20%.    Mike presented some best practices which have emerged from MS’s own experience with its own systems. 

 2.1       How to prevent botched maintenance?  This is an example of the small thing becoming a big hassle, and Watson’s suggestions are to avoid mistakes:

  •  Never take maintenance for granted or trivialize it.
  • Use a realistic test environment.
  • Minimize changes, or patch testing will become a full-time job.
  • Understand and own your system dependencies.

2.2       Automate standard issue responses.  Mike gave an example of ten lines of C# code which could automatically remove an ailing server from an NLB. 
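
Watson’s example was C#; purely for illustration, a similar effect can be had from the command line with the NLB control tool (nlb.exe on Windows Server 2008, wlbs.exe on earlier versions), which you could wrap in the same kind of automated response.

rem Run on the ailing node: finish serving existing connections, then stop accepting new ones
nlb.exe drainstop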

2.3       Prevent web/app server capacity issues by always using at least three servers, planning for peak capacity, and considering the performance impact of background services.  And whenever possible, automate failover.    Avoid being stingy when provisioning web servers. 

2.4       Prevent SQL capacity issues through aggressive, generous growth projections.  Pregrow databases to a comfortable scale, say 100 gigs.  As soon as SQL reaches 75%, immediately start adding storage or scale.    Beyond this, optimize your performance by using 64-bit, and planning disk IO carefully; TechNet has excellent resources on this.  One example is his recommendation of 2 IOPs/gig set for temp, logging, and search.  For data, 1 IOP/gig. 
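
By that rule of thumb, a 100 GB content database calls for roughly 100 IOPS of disk throughput, while 100 GB of tempdb, log, or search volume calls for roughly 200 IOPS.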

2.5       Specifically for SharePoint Online, he has this recommendation for server design: 32 gigs of RAM, 5 TB of storage, redundant power supplies, redundant fans, and two NICs, with two 2.33 GHz quad cores for the CPU.  These are really cheap, he says!  And that was in 2008!  Now you can probably get one with a Big Mac.

2.6       Preventing hardware failure is a crucial skill for all of us, and he has some specific ideas.  First, never use end-of-life or older machinery for production.  Second, since servers are cheap, buy redundant components.  Third, scale out with multiple servers to avoid single points of failure.  Fourth, for all OS and App drives, use RAID 1.  Fifth, and finally, follow vendor recommendations for maintenance and ensure the presence of redundant power through generators or batteries.   

2.7       Watson makes a very good point about dependencies.  Understand them, and arrange your SLAs so that your success does not become hostage to someone else.  “Trust but verify” warns our presenter.

3.       Design for Availability 

In principle, there’s no upper limit to the number of servers a farm can encompass.  That said, there are guidelines.  For example, web servers should start with three, to guarantee access and failover.  Multiple instances of search should be available, both for failover and speed of response.    Thus, for an organization with <50,000 users, Mike advocates for three web/query servers, and a single index/excel server.  For up to 75,000 users, he recommends four web/query servers, and two index/excel servers.  For up to 100,000 users, he proposes one more of each. 

In contrast, he presented a map of MSIT SharePoint.  This was vastly more complicated, with several web portals, each with its own NLBed web servers.  Clustered SQL servers went up to 8 in number, offering up to 6 terabytes of storage.  The largest unit is the EMEA SharePoint, which includes three servers in the Web/Excel/Query role, one in the Search Target role, one in the Index role, and 8 clustered (active/passive) SQL servers offering 6 TB of storage. 

In such large farms, database mirroring is a vital component of HA, and Mike covers it.  In this system, two (or more) databases are linked by encrypted channels, comprising a Principal and a Mirror.  They are in turn monitored by a “witness” server which might be running SQL as simple as SQL Express (though this one is not recommended).    This latter server monitors the linked ones, and when the Principal is offline, it designates the Mirror as the new Principal.  Note that this automatic failover is an option for the SQL servers, but not for the SharePoint farm servers.  SQL 2005 SP1+ can be used for this; Mike recommends Enterprise because it provides for multi-threaded redo queues, which allows for better scalability and speed in applying transactions in SQL mirroring. 

 Watson covered three types of database mirroring.

 -      High Protection: synchronous.  Failover is manual, and transaction writes are synchronized on both servers.  Here there is a low tolerance for latency and performance weakness. 

-      High Availability:  This is the same as High Protection, but with the addition of a Witness server which manages any failover.  This is also synchronous.

-      High Performance:  Writes are not synchronized on both servers; here there is an assumption that everything will be completed successfully on the mirror.  Unlike High Protection, the High Performance mode can be tolerant of latency and low bandwidth.  Unlike the first two, this mode is referred to as asynchronous.

The first two are a local strategy.  The third mode is more suited to remote relationships, as we might expect from the tolerance for latency. 
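
As a rough illustration of what the high-availability flavor looks like once the mirroring endpoints and a restored copy of the database are in place (server and database names here are hypothetical, and this is a sketch rather than a complete procedure):

rem On the mirror, point the database at the principal; then on the principal, point it back and add the witness
sqlcmd -S SQLMIRROR -Q "ALTER DATABASE WSS_Content SET PARTNER = 'TCP://sqlprincipal.contoso.com:5022'"
sqlcmd -S SQLPRINCIPAL -Q "ALTER DATABASE WSS_Content SET PARTNER = 'TCP://sqlmirror.contoso.com:5022'"
sqlcmd -S SQLPRINCIPAL -Q "ALTER DATABASE WSS_Content SET WITNESS = 'TCP://sqlwitness.contoso.com:5022'"

rem SAFETY FULL gives the synchronous modes Watson described; SAFETY OFF gives the asynchronous High Performance mode
sqlcmd -S SQLPRINCIPAL -Q "ALTER DATABASE WSS_Content SET PARTNER SAFETY FULL"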

Several things should be noted here.  SharePoint is not mirroring-aware.  This means that in cases of failover, the SharePoint farm servers need to be manually notified, as by use of the command stsadm -o renameserver -oldservername <oldname> -newservername <newname>.  However, this command is not recognized everywhere within the SharePoint stack.  This means that another technology is needed to make this work. 

A SQL Connection Alias can also make a difference here.  On an application server, you can define an alternate name for a SQL node: the application is configured to use the alias, and the alias points at the current SQL server.  Then on failover, the alias is updated to point at the second server.    Watson then presented a detailed comparison of failover clustering and SQL High Availability Mirroring.    Some key points were that SQL HA Mirroring protects against failed storage, can leverage cheap DAS storage, tolerates up to 1 ms of latency, and accommodates easier implementation of patches and upgrades.    Clustering, on the other hand, is automatically self-correcting, has a simpler recovery model, claims no performance overhead, and represents a minimal operational burden.  More could be said, but Watson begged off, referring to “a long night last night”.  What? 
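
To make the alias mechanism concrete, here is a minimal sketch using the registry; the alias and server names are hypothetical, the same alias can be created through cliconfg.exe or SQL Server Configuration Manager, and 32-bit clients on a 64-bit box read the Wow6432Node branch instead.

rem Create a SQL client alias that the SharePoint servers use in place of the real SQL server name
reg add "HKLM\SOFTWARE\Microsoft\MSSQLServer\Client\ConnectTo" /v SPSQL /t REG_SZ /d "DBMSSOCN,sqlprincipal.contoso.com,1433"

rem After a failover, repoint the same alias at the mirror and the farm follows along
reg add "HKLM\SOFTWARE\Microsoft\MSSQLServer\Client\ConnectTo" /v SPSQL /t REG_SZ /d "DBMSSOCN,sqlmirror.contoso.com,1433" /f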

His next slide helped to make up for this, describing a “Really Good H/A Solution” of active/active principal/secondary sites.  In this case you have separate facilities, distant by no more than a millisecond of latency, and with LAN-like bandwidth, a gigabit.  Both sites boast the SharePoint web/apps server, with the SQL backend of a principal and a mirror and a SQL witness at the main site.    In this situation, SharePoint/App servers at both locations both work with the principal’s SQL, which in turn mirrors to the secondary’s SQL.  In the event that the primary SQL goes down, the SQL witness redirects traffic to the secondary, automatically exchanging their roles.  The main vulnerability of this arrangement is some catastrophe which engulfs both facilities, such as blackouts, volcanoes, floods, hurricanes, tornadoes, or military activity.    Watson towards the end showed a more long-distance arrangement designed to overcome such catastrophes, involving local mirroring for high availability, and log shipping across the country for DR. 

The presentation was not perfect.  For instance, it would have been worthwhile at this point to have a discussion of the business reasons for choosing one or the other HA/DR options, since in most organizations technical considerations alone should not drive the process.  And it can only be disheartening to hear your speaker confess to “a long night”.  Nonetheless, this presentation was a good survey of the basic issues, and is a good introduction to HA and DR for any version of SharePoint.

This Kimmo Forss session from the 2009 SharePoint conference in Las Vegas explored principles and best practices for migrating external content from third-party systems to SharePoint 2010.    Forss seemed uncomfortable as a speaker, delivered stories anemically, and exhibited newb public speaker weaknesses, but the content substantially redeemed this.

Forss starts out discussing the problems of moving in general and compares this with moving content to SharePoint.  Perhaps due to nervousness, he opens with a long anecdote and discusses Sun Tzu’s admonition to “know yourself” before getting to his three-part topic:

  • Planning the migration and analyzing the applications.
  • Choosing a migration approach.
  • Choosing the right tools.

1.    Plan and plan again

What are the keys to success?  People who know the source and target environment are indispensable.  Moreover, skill in knowledge management and development of the application to be moved are vital.  Realistic (i.e., long) timelines are a sine qua non.  And finally, the support of a capable steering committee can make all the difference in the world. 

Forss presents this high-level plan:

  1. Analyze the existing environment.
  2. Conduct a gap analysis.
  3. Plan and build the new environment.
  4. Export data from source (can be done with tools).
  5. Map identities and metadata (optional to Forss).
  6. Import data (can be done with tools).
  7. Manual cleanup and user communication.

Application Analysis means reviewing the content to be migrated into SharePoint and determining its requirements.  The less complex the better; custom applications with complicated processes (integrations, complex workflows, and custom code, for example) represent the high end of difficulty, and out-of-the-box apps with simple data (tables, forms, and lists) are the easiest.  Placing your application along these two axes (custom/out-of-the-box and data/process) will help to determine this.  In a nutshell, migrating content is easy, migrating applications is hard.  Forss recommends a best practice of creating a new application to meet today’s needs rather than trying to shoehorn legacy code into a new environment.  Obviously this can lead to intricate nightmares. 

File Shares.  Migrating content from file shares can be easy or hard, depending on several factors:

  1. Easy: simple files to be moved, with no links.
  2. Medium:  there are blocked file types and large files. 
  3. Hard:  ACLs and inheritance are involved, along with deep hierarchies.  This can be difficult to exactly replicate. 

 Non-technical considerations can also impact your plan.  When files are uploaded to SharePoint, their timestamp becomes that of when they were uploaded.  This can be vexing when prior timestamps must be maintained for legal, auditing, or other reasons.  Forss did not really present solutions for these.  The complexities of document formats, such as files containing embedded files, can also require technical ingenuity. 

More specifically, Forss spent some time on importing material from Lotus Domino.  Here he sees three levels of difficulty:

  1. Minor effort: for form-based applications.  Data is simply entered, and this is simply replicated. 
  2. Medium effort:  template-based applications with corresponding SharePoint functionalities.  Links from one document to another are an example of this.  These usually need to be manually recreated. 
  3. Sizeable effort:  complex applications with backends, long-running migrations, and needs for coexistence, that is, apps which even after being moved still depend on other applications which have not yet been moved. 

Going from SharePoint 2003 to SharePoint 2010, things can be less complicated.  The OOB lists and libraries present little difficulty.  Some effort is needed to move over even simple customizations, and list templates have to be mapped to content types.  Changing site definitions is more complicated, as is rearranging site structures, and cases where coexistence must be maintained.  The quality of your information architecture (seldom world-class, in Forss’ view) can also impact this. 

Finally, Forss covers the optimal scenario, moving content from SharePoint 2007 to 2010.    As with 2003, OOB lists and libraries are straightforward.  Customizations can remain an effort.  In terms of sizeable efforts, Forss sees three areas of concern:  changing site definitions, structures which need to be re-arranged, and cases where coexistence of dependencies must be maintained. 

Lastly, Forss reviewed moving content from other WCM (Web Content Management) platforms.  Simple pages and collections of images present little difficulty.  Medium effort goes into transferring page layouts and links.  Forss mentions that business needs can complicate this, but did not explain how.  The most difficult category encompasses such knotty issues as navigation, web parts, re-arrangement of structures, and re-useable assets.    Forss said little beyond this, which is regrettable, since the most difficult tasks would be expected to receive the most coverage.  More generally, his approach in these sections gives us little to do to mitigate difficulties, and is a genuine disappointment in this presentation.

2.   The Right Migration Approach

The Migration Process is next.  Forss suggests two styles:  direct and staged.  A direct migration means the data is moved directly to production; he gives email as an example of this.  The staged migration involves separating the export of data from the import of this data.  He also discussed the self-service migration, where the migration is done manually, perhaps with tools.  At the same time, there can be complex migrations involving much custom code; here Forss says simply “invoke experts”. 

After this survey, Forss presented a demo of a Domino application with customer data being migrated to SharePoint online.  For this he relied on Access with an XML import.  This represented getting the data out of Domino.  Then he moved it into SharePoint online.  This was a smooth success.  He caveated that some things, such as the timestamps for the data, were lost. 

3.   The Right Tools for the Task

The need for tools can easily arise, and Forss discussed this.  No single one stands out as definitive, and instead he recommended common-sense things to look for: the feature set, its server footprint, the licensing cost and style, and whether it operates standalone or needs consulting as well.    Some such tools are community-based, and Forss demoed a couple he himself had written, referring us to his site at www.codeplex.com/SPMigration to learn more.  He then demoed the process for a migration from a SharePoint 2007 content site to SharePoint 2010 using one such tool.  He showed the ability to export a set of sites to a file-share which could in turn be imported.  The supple options of this tool within SharePoint 2007 were shown to good effect.    He also presented something of the history of such tools, going back to a joint venture with HP to migrate from SPS 2001 to SPS 2003.    Other comments regarding tools and extractors were less clear, to me at least.  The demos were interesting and certainly filled out the last half of the presentation, but with nothing more to rely on than slowing the video to copy the exact process, anyone would have trouble replicating their results.  And I have to admit, after watching them a couple of times, their overall direction was not very clear, and Forss himself apologized for jumping all over the place.  Who am I to disagree?  Regrettably, this last half of the presentation is a jumble, and just after saying he had seven minutes left, Forss abruptly brings his presentation to a close.    

In conclusion, Forss needs presentation coaching.  That being said, there are some good ideas here for all of us seeking to move content into SharePoint 2010.  The demos are unlikely to stay in your memory for very long, but the principles are.

This session from the 2009 SharePoint conference in Las Vegas presented much insight based on beta-build material.  That said, it does not have the depth of Peschke’s presentation; I summarize it because it nonetheless had some good examples of analysis, and also highlights some key advances in the current version of SharePoint. 

Zohar Raz and Kfir Ami-ad’s agenda covered four topics: “The Challenge”, performance improvements in SP-2010, its capacity management approach, and capacity guidance.

Raz first briefly outlined some of the key differences with the new version: WSS renamed to SharePoint Foundation with some new components added (sandboxed code service and usage and health logging), and the SharePoint service applications greatly expanded by the addition of items such as the PowerPoint broadcast service, the Visio Graphics Service, and PerformancePoint.    On the client side, interactivity with Visio and Access has been added.    It goes without saying that all these services can be customized, with as many or as few being offered as desired.  

So, with all this richness of services, what’s the challenge that Raz mentioned?  Resources: the WFE and App servers have more work, the SQL servers have more work, and the client browser has more work. 

What performance improvements in SharePoint respond to this?  Raz identifies four:

Latency.  Screen response time within user expectations.

Throughput.  Achieving speed for the required number of users.

Capacity.  Response time should be adequate for the required number of items in the SQL database.

Reliability.  Making certain that the required materials are available for the required users. 

Latency Improvements.  The most frequently-used pages are lighter and faster.  IE8 is better for WAN operations.  Javascript executes after a page is loaded, so pages become usable faster.  Pages render as bytes arrive, meaning fewer blank waiting pages.  Here’s an example scenario: file opening and saving.  A new protocol, Cobalt, allows for better file uploads and downloads; now they can take place in the background.  It also permits incremental saves for files, so complete file downloads are no longer needed when files change. 

Data Scale Improvements have also been made.  Tens of millions of items can now be placed in a single list.  Now 5000 items at a time can be viewed or queried.  Whereas in SP-2007, 100 gigs was the max size for site collection or content databases, now this can be exceeded.  Nonetheless the recommendation is to aggregate your collections/content in databases of 100 gigs or less.  Otherwise, performance can suffer, and hardware must be apportioned appropriately, especially with parallelized disks in your SAN to optimally support SQL. 

Throughput and Reliability Improvements.  These are based on support experience with SP-2007, which led to ways to throttle excessive client loads.  Large List Throttling is part of this: single-user operations, such as large-scale deletes, can be prevented from monopolizing farm search/list resources, reducing latency spikes.  Such extensive operations monopolize the database, harming performance for others.  Now these operations are throttled when standard users perform them, preserving performance and access for others.  A demo showed the deletion of 300,000 items with no impact whatsoever on performance.  Quite impressive.  That being said, if a farm admin does this, the action is direct and unthrottled. 

Another innovation is throttling excessive client load from client apps.  In this, the web front ends can tell the clients they are busy and direct that sync frequency is lengthened.  In its turn the server throttles low-priority requests when it is overloaded, resulting in 503 messages for them. 

Capacity Management: some new features in 2010 allow a more dynamic, iterative approach.  In this, modeling and study are followed by prototyping and pre-production with load simulation.  Then the hardware selection grows from this.  Next is deployment, followed by production monitoring and ongoing analysis.    Raz likens this to the cockpit of an aircraft.  One resource is the logging database, functioning much as does an airplane’s black box.  This allows correlation between incoming requests and individual server performance among the WFEs.  A demo showed the wide, deep variety of these logs, including such items as “CrawlWorkerTimings”, “PerMinuteTotalUIQueryLatency”, and “SQLMemoryQueries”.  As Raz observed, “The logging database is a goldmine”.    Another resource is the developer dashboard (off by default and enabled by farm admins), which allows you to select sections of code from the browser and analyze their impact on server performance, with separate calculations for the WFE, SQL, memory, and web parts.    This helps to identify the top performance offenders, making them easier to isolate and improve. 
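
For reference, the dashboard Raz described is a farm-wide switch; a minimal sketch with stsadm (OnDemand is usually the friendliest setting, since it adds an icon that lets page owners toggle the display):

rem Enable the SharePoint 2010 developer dashboard on demand for the whole farm
stsadm -o setproperty -pn developer-dashboard -pv OnDemand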

Capacity Guidance and Scaling.   Very explicitly not a cookbook.  Instead, Raz is looking for the general principles of deployment success and architecture.  Much here is familiar but worth repeating:  the single box sandbox/dev environment.  The first “interesting” farm splits SQL off onto its own machine, retaining the WFE and app server functions separately, handling perhaps 5 RPS (requests per second).    Third is the medium farm for several tens of thousands of users.  Here there’s a load-balanced SQL, several app servers, and several WFEs, able to accommodate 50 RPS and several terabytes of data.    Next step up is the “big” farm, with more of everything, but with the same basic tier structure as the medium farm, handling 500 RPS and 10-20 terabytes of data.  Here it starts to make sense to split the farm up into federated farms, perhaps one for search and one for social networking, etc. 

Next Raz showed the MS environment for two departmental portals handling 4 web apps, a 70-disk SAN, 15,000 users and 7,000 MySites, handling 7,000,000 requests a day, or 150 RPS.  Obviously this is an illustration, not a prescription, but it was enlightening to see the principles of architecture and usage here. 

Raz concluded with some early beta guidance.  This was less relevant in a post-beta world. 

Conclusion:  Good principles of capacity planning and how SharePoint works autonomously to maintain its availability, and insights into how to maximize the superior monitoring resources of SharePoint 2010.  Worth seeking out.    Additionally, there were very good thoughts on topology and architecture; I wish these had been explained in much more detail.
