Chapter 1. Content and Documentum
Every piece of information seen on a website can be classified as content, be it text, graphics, rich media, video, engineering drawings, XML, images, or scanned files: just about anything and everything!
Content can be of various kinds, from pure textual pages to training material, online reference manuals, graphical screenshots and even complex data graphs.
One of the simplest ways to describe content management is through the example of a daily newspaper website. Most of us start our day browsing through our favorite newspaper (be it the conventional hard copy or the online version). Have you noticed something in particular about most newspapers? The structure or layout of most sections remains constant every day. What typically changes on a daily basis is the actual content within those sections.
The layout of the headlines remains constant, though the actual headlines change every day. Sections like cartoons, the editorial corner, and the weather report maintain the same look-and-feel every day, but their content changes with each new edition of the newspaper.
The online version of the newspaper needs to be updated every day with the new HTML, graphics, and text depending on the news. Imagine the time it would take to update the website's HTML/JSP pages manually every day to reflect the latest news. This would cause an increased dependence on the technical web developers to update the content. Updating several hundreds of HTML pages every day would also cause a time and resource problem.
Additionally, it would mean technical web developers dealing with content they don't even understand and yet have to upload safely within the security boundaries of the organization. The editorial staff and content contributors/authors would have to rely on the IT staff every day so that their content could make its way to the actual website.
The problems multiply since IT staff turnover is high in many organizations: imagine having to recruit new web developers on a periodic basis just to maintain live websites. Moreover, what if the page updates take so long that by the time the updated content shows up on the website, it is already stale?
Today's business environment requires accurate, up-to-date information to be available 24/7 on the organization's websites. A lackadaisical attitude can push a business out of its market space. The problems of managing content on websites will keep growing with time because of the increased visibility of websites today.
It is now easy to understand the need for an effective content management methodology that can result in:
Decreased dependence on IT staff to run and maintain the core business
Reduction in cost and better ROI to maintain the core business
Non-technical contributors maintaining their business website all by themselves
No need for non-IT staff to learn web technologies such as HTML, JavaScript, and JSP to run the core business
Always having the most up-to-date information available on the business website without unnecessary delays
Security mechanisms restricting the editing of information by unrelated business divisions, for example, restricting the editing of sensitive financial information to the administrative department
Automation of content creation/approval/publishing through a workflow mechanism
Reduced expenses in maintaining hardcopy versions of documents/manuals/content
Rollback mechanisms in case the updated content needs to be pulled off the website
Effective capture and use of content metadata for indexing and searching
This list is not complete—the virtues of having a good content management methodology are many and varied. The above list simply gives us an idea about the criticality of content management in today's demanding business space.
In a nutshell, what exactly is content management? One of the numerous available websites on content management describes content management as follows:
Content management is the organizing, categorizing, and structuring of information resources (text, images, documents, etc.) so that they can be stored, published, and edited with ease and flexibility. A content management system (CMS) is used to collect, manage, and publish content, storing the content either as components or whole documents, while maintaining dynamic links between components.
Figure 1.1 represents the conventional process of creating content for a website, getting it approved by a sequence of business users and finally having the web developer (IT staff) update the HTML pages to reflect this approved content.
However, this method is not without its drawbacks. Authoring content and getting it manually reviewed and approved by a string of business users is time-consuming, and there is then a heavy dependency on the IT staff to make the changes manually in the website pages. By the time the sequence of steps is completed, the content is probably stale and no longer appropriate for the organization's website!
1.1 Need for an Effective CMS
Most of the above-mentioned problems with content management can be solved by using a content management system (CMS). A good CMS allows content authors to create content in the form of articles through pre-defined templates. The content author simply needs to provide content (plain text, pictures, etc.) in the template fields. The content management system then uses pre-defined rules to style the article, thus separating the actual content from its display/layout structure. The author needs to be concerned only with the core content and not with its look-and-feel and formatting, thus saving loads of time and pain. Some content management systems also require the author to enter metadata for the content, for example, the creator name and keywords, so that these can be associated with the content and used for indexing and searching the website.
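The separation of content from layout described above can be illustrated with a minimal sketch. It is not how any particular CMS works internally; the names `ARTICLE_TEMPLATE`, `Article`, `render`, and `build_keyword_index` are hypothetical and chosen only for this example. The author supplies plain field values and metadata, while the template (prepared once by a web developer) decides how they are displayed.

```python
from dataclasses import dataclass, field
from html import escape
from string import Template

# Hypothetical page template, written once by a web developer.
# It owns the layout; authors never touch this markup.
ARTICLE_TEMPLATE = Template(
    "<article>\n"
    "  <h1>$title</h1>\n"
    '  <p class="byline">By $creator</p>\n'
    '  <div class="body">$body</div>\n'
    "</article>"
)


@dataclass
class Article:
    """Content plus metadata, as entered by a non-technical author."""
    title: str
    body: str
    creator: str
    keywords: list[str] = field(default_factory=list)


def render(article: Article) -> str:
    """Apply the pre-defined template to the author's plain content."""
    return ARTICLE_TEMPLATE.substitute(
        title=escape(article.title),
        creator=escape(article.creator),
        body=escape(article.body),
    )


def build_keyword_index(articles: list[Article]) -> dict[str, list[str]]:
    """Map each lower-cased keyword to the titles that carry it."""
    index: dict[str, list[str]] = {}
    for article in articles:
        for keyword in article.keywords:
            index.setdefault(keyword.lower(), []).append(article.title)
    return index


if __name__ == "__main__":
    story = Article(
        title="Rain & Wind",
        body="Storms are expected over the weekend.",
        creator="Jane",
        keywords=["Weather"],
    )
    print(render(story))
    print(build_keyword_index([story]))
```

Changing the look of every article then means editing the template once, not touching each article, and the keyword index shows how author-entered metadata can drive a simple site search.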
Unlike the traditional content management approach, in which an author manually gets the content/articles approved by editors and senior members of the business content approval divisions, a good CMS has an automated workflow mechanism. The author simply specifies the sequence of approvers for the article and the automated workflow does the rest of the work. It ensures that the content does not get published to the website until every editor and approver in the sequence has approved it.
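The approval sequence can be sketched as a small state holder. This is only an illustration under simplifying assumptions (a strictly sequential list of approvers, with a single rejection stopping the workflow); the class `ApprovalWorkflow` and its methods are hypothetical and do not correspond to any product's API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ApprovalWorkflow:
    """Routes one article through an ordered list of approvers."""
    approvers: list[str]
    approved_by: list[str] = field(default_factory=list)
    rejected: bool = False

    def __post_init__(self) -> None:
        # Assumption: an article always needs at least one approver.
        if not self.approvers:
            raise ValueError("at least one approver is required")

    def next_approver(self) -> Optional[str]:
        """Return who must act next, or None if the workflow has ended."""
        if self.rejected or len(self.approved_by) == len(self.approvers):
            return None
        return self.approvers[len(self.approved_by)]

    def approve(self, user: str) -> None:
        """Record an approval; only the next approver in line may act."""
        if user != self.next_approver():
            raise PermissionError(f"{user} is not the next approver")
        self.approved_by.append(user)

    def reject(self, user: str) -> None:
        """Stop the workflow; the article can no longer be published."""
        if user != self.next_approver():
            raise PermissionError(f"{user} is not the next approver")
        self.rejected = True

    def can_publish(self) -> bool:
        """True only after every approver has approved, in order."""
        return not self.rejected and self.approved_by == self.approvers


if __name__ == "__main__":
    workflow = ApprovalWorkflow(["editor", "legal"])
    workflow.approve("editor")
    print(workflow.can_publish())   # still waiting on "legal"
    workflow.approve("legal")
    print(workflow.can_publish())
```

The publishing step would check `can_publish()` before pushing anything to the live site, which is what keeps unapproved content off the website.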
This approach requires the IT staff (web developers) to prepare the templates and associated rules as a one-time activity, along with the stylesheets that format the entered articles and are responsible for the look-and-feel of the website.
The IT staff also needs to install and configure the CMS software once. From then onwards, the content authors simply use the system and templates, which removes the ongoing dependency on web developers.
Figure 1.2 gives a graphical perspective on the benefits of using a CMS.
The one-time effort that a web developer puts into creating templates and rules, which content creators can then reuse, is a good money-saving approach.
The automated workflow available in a CMS routes the content through its different lifecycle stages, finally getting it approved and published to the business website.