Now that you’re aware of the technical underpinnings of SharePoint’s storage, and of some of the general directions that a solution might take, let’s lay out business goals for SharePoint storage. Some of these might seem contradictory at this point, but that’s okay -- we’re after the “perfect world” set of capabilities.
Keep in mind that these goals apply, potentially, to all the shared and collaborative data in your environment. Right now, it might not all be in SharePoint. Whether it is there, or should be there, is not the consideration right now. SharePoint is a means, not a goal in and of itself. What we’re going to review now are our goals, and we’ll determine later whether SharePoint can be made to meet these goals.
As I wrote earlier, there are some advantages to keeping content elsewhere, especially off-site. We want to be able to include content in SharePoint regardless of where the content is located, and in some cases -- as I’ll outline in a bit -- we may have good business reasons for not migrating that content into the SharePoint database.
We want all of our content to be alert-enabled. Alerts provide a way for users to “subscribe” to an individual document and to be notified of any changes to it. This might allow a document owner, for example, to be notified when someone else has made changes to a document; it might allow users who rely on a document -- such as current sales specials or personnel policies -- to be notified when changes have been made that they ought to review and become familiar with.
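To make the subscribe-and-notify idea concrete, here is a minimal sketch in Python. It is a toy model only, not SharePoint's alert API: the `AlertRegistry` class, its methods, and the user and file names are all hypothetical, and the choice to skip notifying the person who made the change is an assumption of this sketch.

```python
class AlertRegistry:
    """Toy model of per-document alert subscriptions (hypothetical, not SharePoint's API)."""

    def __init__(self):
        # Maps a document name to the set of users subscribed to it.
        self._subscribers = {}

    def subscribe(self, user, document):
        self._subscribers.setdefault(document, set()).add(user)

    def notify_change(self, document, changed_by):
        """Return the notification messages a change to `document` would trigger."""
        subscribers = self._subscribers.get(document, set())
        # Assumption: the person who made the change does not need an alert.
        recipients = sorted(subscribers - {changed_by})
        return [f"{user}: '{document}' was changed by {changed_by}" for user in recipients]


registry = AlertRegistry()
registry.subscribe("owner", "sales-specials.docx")
registry.subscribe("rep", "sales-specials.docx")
notifications = registry.notify_change("sales-specials.docx", changed_by="rep")
print(notifications)
```

The point of the sketch is that a subscription is tied to an individual document, so content that the system doesn't know about can't be subscribed to at all.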
This is something that SharePoint offers, but we need to figure out whether we can practically and affordably include all of our content in SharePoint. Ideally, we do want all of our content included in SharePoint in some way so that any piece of content can be subscribed to for alerts.
Metadata- and tagging-enabled content
SharePoint allows for content to have predefined and custom metadata attached to it, along with user-defined tags. Companies use these features to attach additional meaningful keywords to content, and to classify content. For example, companies might use metadata to identify a content item as “confidential” or to associate it with a particular project. Of course, the content has to live in SharePoint’s database in order for this support to exist -- but we want these features for all of our content.
These features in particular can make long-term content management easier, and can enable users to locate content more easily and quickly by using common keywords that might not appear within the content body (especially for media files like videos, which don’t have a written “body” for keywords to appear within).
We don’t necessarily want every single document modified by everyone in the environment, but we might be open to everyone suggesting changes. One way to achieve that is to apply workflow to our documents. Workflow enables a user to modify a document and submit it for approval, either to a group of reviewers or to a single reviewer. Before the modified document becomes the official “current” version, the modifications would have to be approved by some predetermined set of approvers.
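The approval step described above can be sketched as a tiny state machine. This is an illustration under stated assumptions, not how SharePoint implements workflow: the `ControlledDocument` class and the reviewer names are hypothetical, and the rule that every designated approver must sign off is just one possible approval policy.

```python
class ControlledDocument:
    """Toy approval workflow (hypothetical): edits stay pending until approved."""

    def __init__(self, content, approvers):
        self.current = content           # the official "current" version
        self.approvers = set(approvers)  # predetermined set of approvers
        self.pending = None              # proposed content awaiting approval
        self.approvals = set()           # approvers who have signed off so far

    def submit(self, new_content):
        """Propose a change; the current version is untouched until approval."""
        self.pending = new_content
        self.approvals = set()

    def approve(self, reviewer):
        if self.pending is None:
            raise ValueError("nothing is awaiting approval")
        if reviewer not in self.approvers:
            raise PermissionError(f"{reviewer} is not an approver")
        self.approvals.add(reviewer)
        # Assumption: publish only once every designated approver has approved.
        if self.approvals == self.approvers:
            self.current, self.pending = self.pending, None


doc = ControlledDocument("Policy v1", approvers=["hr_lead", "legal"])
doc.submit("Policy v2")
doc.approve("hr_lead")
after_first_approval = doc.current   # still the old version
doc.approve("legal")
after_all_approvals = doc.current    # the change is now official
```

Anyone can call `submit`, but only the approvers can move a change into `current`, which is the "everyone can suggest, not everyone can modify" behavior we're after.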
SharePoint offers this functionality but only for content that resides within its database. In other words, we again need to see whether it’s practical and affordable to include all of our content inside SharePoint so that we can enable workflow on whatever pieces of content we feel require it.
Another SharePoint feature is the ability to keep past versions of documents. Unlike a tape backup or even Windows’ VSS, SharePoint doesn’t create a new version on a scheduled basis. Instead, it creates a new version of a document whenever someone modifies the previous version -- ensuring that we have every version of the document that ever existed, if desired. Users with appropriate permissions can access older versions of a document, compare them with other versions, and even make an older version the current, “official” version for other users to access.
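The difference between change-driven and schedule-driven versioning is easy to show in a few lines. The sketch below is a conceptual model only: `VersionedDocument` and its methods are hypothetical names, and treating a restore as "append the old content as a new version" is an assumption of this sketch rather than a statement about SharePoint's internals.

```python
class VersionedDocument:
    """Toy model (hypothetical): a new version is recorded on every modification."""

    def __init__(self, content):
        self._versions = [content]  # list index 0 holds version number 1

    @property
    def current(self):
        return self._versions[-1]

    @property
    def version_count(self):
        return len(self._versions)

    def modify(self, new_content):
        """Record a new version at the moment of change, not on a schedule."""
        self._versions.append(new_content)
        return self.version_count

    def get_version(self, number):
        if not 1 <= number <= self.version_count:
            raise IndexError(f"no such version: {number}")
        return self._versions[number - 1]

    def restore(self, number):
        # Assumption: restoring appends the old content as a new version,
        # so nothing in the history is ever lost.
        return self.modify(self.get_version(number))


doc = VersionedDocument("draft")
doc.modify("revised")
doc.modify("final")
doc.restore(1)
print(doc.current, doc.version_count)
```

Because every call to `modify` adds an entry, no intermediate state can slip between two scheduled snapshots, which is exactly what a scheduled backup can't promise.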
In an ideal world, we’d have the option for versioning for every document in the entire enterprise -- but normally, SharePoint can only do this for documents that live within its repository. So once again, we need to decide whether we can afford to include everything within SharePoint.
Most of today’s data repositories support some kind of security. The Windows file system, for example, has a very granular security system. What would be nice is if we could manage access to all of our shared data in a single place. Obviously, SharePoint is a candidate to be that place because it too supports a robust and granular security system. In fact, because its security information lives in a database rather than being distributed across individual files and folders, SharePoint is arguably a better way to manage storage, offering the potential for easier security reporting, auditing, and so forth.
Again, however, we can only get those advantages if all of our content lives in the SharePoint database -- which may or may not be practical or affordable.
Indexed and searchable content
One of SharePoint’s biggest advantages is its ability to index the content in its database, and make that content searchable for your users. It’s like having your own private Google or Bing search engine that is accessible only to employees and that includes all of your enterprise data. SharePoint’s indexing system is security-aware, meaning it won’t show users search results for things they don’t have permission to access in the first place. Of course, in order for your content to be indexed, SharePoint needs all of it to move into the database. Even if you’ve decided that you’ll pay whatever storage costs are needed to make that happen, there’s still the significant project of getting your data into that database.
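"Security-aware" search simply means that permission checks are applied to results before the user ever sees them. Here is a minimal sketch of that idea. It is not SharePoint's search API: `IndexedItem`, `search`, the group names, and the sample documents are all hypothetical, and real indexing is far more sophisticated than a substring match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedItem:
    """One entry in a toy search index (hypothetical structure)."""
    title: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this item


def search(index, query, user_groups):
    """Return titles that match `query` AND that the user is allowed to see."""
    query = query.lower()
    groups = set(user_groups)
    return [
        item.title
        for item in index
        # Keyword match first, then the security trim: the user must share
        # at least one group with the item's allowed groups.
        if query in item.text.lower() and item.allowed_groups & groups
    ]


index = [
    IndexedItem("Holiday policy", "Company holiday schedule and policy", frozenset({"all_staff"})),
    IndexedItem("Salary bands", "Confidential salary policy by grade", frozenset({"hr"})),
]
staff_results = search(index, "policy", {"all_staff"})
hr_results = search(index, "policy", {"all_staff", "hr"})
print(staff_results, hr_results)
```

Both documents match the keyword, but a regular staff member never learns that the confidential one exists.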
Minimal database impact
Here’s where our business goals start to contradict each other. We want all of the above capabilities -- searching, security, alerts, workflow, and so on -- but we also want minimal impact on the SQL Server databases that support SharePoint. We want those databases to perform at maximum efficiency at all times, and we ideally want them to take up as little space as possible -- simply because “space” costs money to acquire and to protect and maintain.
Minimal WFE impact
I described earlier how streaming media files can sometimes have a negative impact on SharePoint’s WFE, so we want to avoid that impact. We still want our media files “in” SharePoint somehow -- so that they can be indexed, searched, and managed just like any other form of content -- but we don’t want to do so in a way that will create a burden for the WFE.
We also want all of the above goals without having to spend time and money migrating data into SharePoint. Although great migration tools exist, any migration project is still a project. We might decide that some degree of content migration is acceptable, but we don’t want migration to be a hard-and-fast prerequisite for gaining the above capabilities for all of our content.
In other words, we want to be able to use all of SharePoint’s features without necessarily putting all of our content into SharePoint’s database.
Transparent to users
Based on the above, seemingly conflicting goals, we’re likely going to be looking at some kind of hybridized system that involves SharePoint, SQL Server, perhaps some kind of BLOB offloading, and likely other techniques. With that possibility in mind, let’s make one last, formal business requirement of whatever solution we come up with: Our users can’t know.
The goal here is to get all of our content into a centralized SharePoint infrastructure so that our users can access all of their content in one consistent fashion. That’s SharePoint’s high-level vision, and we have to maintain it. We can’t start throwing wrenches into the system that require users to go here for some content, there for other content, and over there for still more content; it all needs to be in one place.
Whatever we’re doing to archive old content, for example, still has to make it look like that content still lives in SharePoint, even if it really doesn’t. This is perhaps our ultimate business goal, and any solution that doesn’t meet at least this goal is one that we will have to set aside as unsuitable.
This chapter is an excerpt from the book, Intelligently Reducing SharePoint Costs Through Storage Optimization, authored by Don Jones, and published by Realtime Publishers, November 2010, ISBN 978-1-935581-25-3, Copyright 2010 by Realtime Publishers. Download the complete book for free at Realtime Nexus Digital Library.