What’s All This Talk About Intelligent Content?

The Content Wrangler - Thu, 2008-12-04 04:17

By Ann Rockley, The Rockley Group

Content has often been managed as documents. Metadata for search and retrieval has become more and more important as the amount of content has increased. In recent years with the increased interest in the use of Extensible Markup Language (XML) for content creation and the rising popularity of the Darwin Information Typing Architecture (DITA) XML standard, content has begun to move from unstructured documents to structured XML-based component-based content. And with the advent of XQuery, an XML query language that searches on the structure of content, then manipulates and renders content, we can do so much more than just full-text searching. We’ve gone from documents which are “black boxes” to intelligent content which is structurally rich and semantically aware, and is therefore automatically discoverable, reusable, reconfigurable and adaptable.

Let’s look at the definition for intelligent content in more detail:

Structurally Rich

This means that the content is structured and, more importantly, semantically structured: the structure itself has meaning. Something as simple as a whitepaper could have a structure like (executive summary, introduction, discussion and conclusion), and a marketing brochure could have a structure like (positioning statement, value proposition, features and benefits). Or we could have content which follows a standard like DITA or DocBook. Each of these structures enables us to understand what content we have, so that we could search within positioning statements only, or pull out that statement and use it in another type of content.

If we have a structure in our content we can manipulate it. For example, we can automatically determine how to publish it to multiple channels (print, web, mobile) or we can filter out some content (e.g., tables may not work as well in the mobile environment). Also, if it is structurally rich we can perform searches or narrow our search to the particular type of information we are interested in (e.g., look for all occurrences of a phrase like “high definition” in positioning statements).

When it is structurally rich we can do so much more.
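To make the idea of searching within a structure concrete, here is a minimal sketch in Python. The brochure markup, its element names, and the find_in_element helper are all assumptions made up for illustration; they do not come from DITA, DocBook, or any particular product.

```python
import xml.etree.ElementTree as ET

# Hypothetical brochure markup; the element names are illustrative only.
BROCHURE_XML = """
<brochure>
  <positioning-statement>Our high definition display sets the standard.</positioning-statement>
  <value-proposition>See more detail for less money.</value-proposition>
  <features>
    <feature>High definition panel</feature>
    <feature>Slim bezel</feature>
  </features>
</brochure>
"""


def find_in_element(xml_text, element_name, phrase):
    """Return the text of every <element_name> whose content contains phrase."""
    root = ET.fromstring(xml_text)
    matches = []
    for node in root.iter(element_name):
        text = "".join(node.itertext()).strip()
        # Case-insensitive match, restricted to the chosen element type.
        if phrase.lower() in text.lower():
            matches.append(text)
    return matches


# Search positioning statements only, ignoring the feature list.
hits = find_in_element(BROCHURE_XML, "positioning-statement", "high definition")
print(hits)
```

A full-text search would also return the feature that mentions high definition; because the structure has meaning, the search can be narrowed to the one element type we care about.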

Semantically Aware

The word semantic means “meaning”. Semantically aware content is content which has been tagged with metadata to identify the kind of content it is. For example, you might tag your content with industry sector, role or audience, and product information. If it is tagged with semantic metadata, it is possible to automatically build customized information sets based on audience or industry, for example. As more organizations start to create personalized content (content which is dynamically assembled on demand to match a user’s need, behavior or profile), this type of metadata becomes really important.

In addition, as content is pushed to wikis, integrated through mashups or pipes it becomes even more important to ensure that our content is semantically tagged. Without semantic metadata it is very difficult to automatically, let alone manually, find the content we need.
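As a rough illustration of building a customized information set from semantic metadata, consider the following Python sketch. The Component class, the metadata keys (audience, industry) and the sample library are hypothetical; a real system would take its vocabulary from the organization's own metadata model.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A content component plus its semantic metadata (all values hypothetical)."""
    title: str
    body: str
    metadata: dict = field(default_factory=dict)


LIBRARY = [
    Component("Overview", "What the product does.",
              {"audience": {"admin", "end-user"}, "industry": {"finance", "healthcare"}}),
    Component("Audit logging", "How to enable audit logs.",
              {"audience": {"admin"}, "industry": {"finance"}}),
    Component("Patient privacy", "Handling patient records.",
              {"audience": {"admin"}, "industry": {"healthcare"}}),
]


def build_information_set(library, **criteria):
    """Keep components whose metadata contains every requested value."""
    return [
        component for component in library
        if all(value in component.metadata.get(key, set())
               for key, value in criteria.items())
    ]


finance_admin = build_information_set(LIBRARY, audience="admin", industry="finance")
print([component.title for component in finance_admin])
```

Untagged content would never match any criterion here, which is the point made above: without semantic metadata there is nothing for an automated build to select on.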

Discoverable

If the content has semantic tags and is structurally rich it is a whole lot easier to find exactly what we are looking for. When it is structurally rich, and assuming our content is in XML, we can use XQuery to query the structure of the content to find specific information. Then when we add semantic tagging to the content we have a great deal of information that will allow us to zero in on exactly the content we are looking for. You’ve heard of data mining? Well, now we can do content mining! Sure, we could have done this with regular old unstructured content, but we would have to develop (and maintain) highly complex algorithms to interpret the content; with intelligent content, discovering relevant content is much easier.

Reusable

Reusable content, content which is created once and used many times throughout an information set, has been used for years in technical documentation, but its popularity is quickly moving into business documents like marketing materials, proposals, contracts, and policies and procedures. We can create modular structured content that can either be easily retrieved for opportunistic reuse (manual reuse) or automatically retrieved for systematic reuse (automatic reuse).

Reconfigurable

Structured content is content separate from format, in other words the look-and-feel of the content is not embedded in the content. That makes it very powerful. Knowing the structure of the content, we can output it to multiple channels, reconfiguring it to best meet the needs of the channel, or we can automatically mix-and-match content to provide us with the information our customers need. We can even transform content (reconfigure it) from one structure to another (mobile online help, print publication, website knowledge center), but only if we know what the structure is in the first place.
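The following Python sketch shows one way the same structured source could be reconfigured for two channels. The topic markup and both render functions are assumptions for illustration only; in practice this job is usually done with a transformation language or a publishing engine.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical format-free topic: no fonts, colors, or layout in the source.
TOPIC_XML = """
<topic>
  <title>Sending a text message</title>
  <para>Open the messaging application.</para>
  <table><row><cell>Key</cell><cell>Action</cell></row></table>
  <para>Type your message and press Send.</para>
</topic>
"""


def render_html(root):
    """Web channel: keep every element, including tables."""
    parts = []
    for node in root:
        if node.tag == "title":
            parts.append(f"<h1>{escape(node.text)}</h1>")
        elif node.tag == "para":
            parts.append(f"<p>{escape(node.text)}</p>")
        elif node.tag == "table":
            rows = []
            for row in node.iter("row"):
                cells = "".join(f"<td>{escape(cell.text)}</td>" for cell in row.iter("cell"))
                rows.append(f"<tr>{cells}</tr>")
            parts.append("<table>" + "".join(rows) + "</table>")
    return "\n".join(parts)


def render_mobile_text(root):
    """Mobile channel: plain text, with tables filtered out."""
    lines = []
    for node in root:
        if node.tag == "title":
            lines.append(node.text.upper())
        elif node.tag == "para":
            lines.append(node.text)
        # Tables are skipped on purpose for the small screen.
    return "\n".join(lines)


root = ET.fromstring(TOPIC_XML)
html_page = render_html(root)
mobile_page = render_mobile_text(root)
print(html_page)
print(mobile_page)
```

Neither renderer could make these decisions if the source were a formatted document; both rely on knowing which part is a title, a paragraph, or a table.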

Adaptable

We frequently create our content for a particular need or audience, but content can be adapted (used in a different way), often without our knowledge, to meet a new need. Think of mashups: we don’t know how our content is being aggregated, but it can be easily reused by both humans and computers because we have structured and tagged it intelligently.

A Few Scenarios

The following are a few scenarios that illustrate intelligent content.

Customization: Mobile Phones

A major cell phone manufacturer and distributor produces over 100 different phones. They range from simple handsets to highly capable models designed to support email, video, and conferencing applications. While each handset has a unique market position, there are numerous features that are common across handsets, for example, texting. The description of texting is the same no matter what model of phone we examine, but each handset may require different key sequences, images, key names, and so on. In addition to handset configurations, there are regional differences that determine the features a particular handset will support, as well as region-specific safety standards and language requirements. Some of the handsets are made available through specific carriers (mobile phone sales partners, for instance) requiring different contact information and branding. One component of information could have as many as 500 variations!

To reach their goal they make the content intelligent to facilitate automatic builds and content filtering by:

  • Creating structured component-based content
  • Creating semantic metadata to clearly identify what piece of content is appropriate in which situation
  • Creating rules which identify how each product configuration is to be built
  • Using variables for key sequences so that they can be automatically populated with the correct content at the time of publication
  • Using a component content management system to publish the content on demand based on the configuration rules
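The combination of metadata, configuration rules and variables listed above can be sketched in a few lines of Python. Everything here is hypothetical: the component texts, the configuration names (basic-eu, pro-us), the feature labels and the variable names are invented to show the mechanism, not taken from the manufacturer's system.

```python
from string import Template

# Components carry metadata saying which feature they apply to ("all" = every handset).
COMPONENTS = [
    {"text": "To send a text, press $send_key.", "applies_to": {"all"}},
    {"text": "Hold $camera_key to take a photo.", "applies_to": {"camera"}},
    {"text": "For support, contact $carrier_support.", "applies_to": {"all"}},
]

# Build rules: which features each configuration has, and its variable values.
CONFIGURATIONS = {
    "basic-eu": {
        "features": set(),
        "variables": {"send_key": "OK", "carrier_support": "support@example.com"},
    },
    "pro-us": {
        "features": {"camera"},
        "variables": {
            "send_key": "Send",
            "camera_key": "the side button",
            "carrier_support": "help@example.com",
        },
    },
}


def build_guide(config_name):
    """Filter components by metadata, then populate variables at publish time."""
    config = CONFIGURATIONS[config_name]
    allowed = config["features"] | {"all"}
    lines = []
    for component in COMPONENTS:
        if component["applies_to"] & allowed:
            lines.append(Template(component["text"]).substitute(config["variables"]))
    return lines


for name in CONFIGURATIONS:
    print(name, build_guide(name))
```

One source component for texting serves both handsets; only the variable values and the filter outcome differ per configuration.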

By utilizing intelligent content, they reduced translation costs and optimized processes. And they are actually making more money because they can now provide highly relevant, personalized content, which has led to an increase in sales.

Personalization of Healthcare Insurance Program

People have been talking about personalized content for years, but often back off because creating personalized content is a lot of work with traditional content. Not so with intelligent content.

A Health Maintenance Organization (HMO) provides health insurance for hundreds of companies with thousands of policyholders. The HMO builds a self-serve site to enable employees to review coverage, submit claims and get customized health and wellness information.

Personalization is supported by intelligent content through:

  • Development of personas to identify the key characteristics of customers
  • Policyholder login profiles that match the customer profiles
  • Structured component-based content
  • Semantic metadata used to tag content based on personas
  • Personalization rules to be applied based on profile and policyholder requests

Dynamic Delivery of Investment Information

A financial services company has been producing content for its investors for both the web and print. There are daily News and Notes to keep investors informed of key breaking news, weekly reports to summarize an area of particular interest, monthly reports and annual reports. The reports have always been produced as PDF while the News and Notes are web-based. There is more than a decade’s worth of content squirreled away on the web and in file servers throughout the organization. This pattern of content delivery has been pretty effective until now. With the economic meltdown, investors are clamoring for content daily, even hourly, and they want more than just the information the financial services company can provide.

To calm investors and to provide as broad a swath of high-quality content as possible for investors who have decided to stay in for the long haul, they decide to change their paradigm and offer personalized dynamic content delivered automatically to their investors. They already have much of their content in XML and what they don’t they decide to convert to XML. They also set up RSS and other feeds to ensure that content can be gathered from multiple sources, incorporated with their own information, and automatically delivered to investors.

They build a set of profiles of their investors and provide a simple way for investors to indicate the types of information they would like to receive. Some investors choose to see a historical perspective on what has happened in previous periods of market instability, and all have asked for hourly updates. Investors can search based on specific criteria and assemble customized views of the information that meet their specific investment interests. New information coming into the financial services company is captured, converted to XML, searched and retrieved based on specific criteria, assembled and transformed, then delivered to investors.

Dynamic delivery is supported by intelligent content through:

  • Customer personas
  • Customer login
  • Structured content
  • Semantic metadata
  • XQuery and full text search for retrieval
  • Dynamic multichannel publishing
  • RSS

Who is Using Intelligent Content?

There are a number of industries that are making use of intelligent content. Companies whose product is content (publishing and media companies, for example) have begun to adopt intelligent content as a methodology for moving away from their traditional print products to a truly multi-channel, and often personalized, content offering. Companies that produce huge volumes of content, such as life sciences (pharmaceutical, medical device, and health and hospital organizations) and financial companies (insurance, investments, banking), use intelligent content to optimize access and retrieval. The high technology industry has been moving towards intelligent content for a number of years, but is not yet making a lot of use of metadata and personalization. Government is starting to use intelligent content to manage and deliver legislative content.

Benefits of Intelligent Content

There are many benefits of intelligent content. We can:

  • More easily find it
  • Deliver it
  • Customize it
  • Personalize it
  • Automatically deliver it to multiple channels
  • Simultaneously release content in multiple languages

And…

  • Reduce costs
  • Speed up delivery time
  • Optimize resources
  • Do more with the same resources
  • Increase customer satisfaction

Learning More About Intelligent Content

Creating intelligent content is an emerging practice, one that holds tremendous promise for streamlining business processes, improving service, reducing expenses, and empowering content consumers. If you’re interested in learning more about how to create, manage and deliver intelligent content, consider attending Intelligent Content 2009 in Palm Springs, CA. This intimate educational event, brought to you by The Content Wrangler and The Rockley Group, takes place January 29-30, 2009 at Le Parker Méridien Palm Springs. Early bird discount tickets are still available. Register today to reserve your seat!

About Ann Rockley

Ann Rockley is President of The Rockley Group, a consultancy that has an international reputation for developing customer-centric enterprise content management strategies and underlying information architecture. Rockley is a frequent contributor to trade and industry publications and a featured speaker at numerous conferences in North America and Europe.

She has been instrumental in establishing the field in online documentation, single sourcing (content reuse), unified content strategies, and content management best practices. Rockley is co-chair of the OASIS DITA Enterprise Business Documents Subcommittee. Rockley led Content Management Professionals to a prestigious eContent 100 award in 2005.

Rockley is a Fellow of the Society for Technical Communication and has a Master of Information Science from the University of Toronto. Rockley is the co-author of the best-selling book, Managing Enterprise Content: A Unified Content Strategy (New Riders Publishing ISBN 0-7357-1306-5).

PowerXEditor Eases Online Collaborative Authoring and Workflow

The Content Wrangler - Tue, 2008-12-02 19:24

By Rahel Bailie, special to The Content Wrangler

As the market develops for XML tools that work for a growing number of user types in a variety of industries, we are starting to see creative products come on the market. One need that is becoming more common is for the average business author – that is, someone who is not a technical writer and hasn’t studied the craft of writing – to create documents that validate to an XML DTD.

The needs of the typical business user differ vastly from those of authors in the technical production environment. Technical writers and developers want to be exposed to the XML. The ability to see what’s happening behind the curtain, so to speak, allows authors to control the tagging and ensure that whatever they create works for their readers as well as creates valid XML. The complexity of the content, when taking into account things like conditional text, variables, and multi-channel publishing outputs, makes the exposure to the XML tags and elements critical.

In the business world, exposure to the XML is not necessarily a welcome strategy. Authors who may be subject matter experts in law, medicine, science, or business may not even know what XML stands for, let alone want to learn how to use it. Their needs are to create content that conforms to a documentation standard, such as the APA guidelines. They want to concentrate on the content, not on the form. Yet, the organization needs valid XML documentation in order to publish, whether that means submitting the content to academic journals, or streaming the content into their own publication software.

Aptara Inc. has developed PowerXEditor, a full XML editor complete with workflow, to address exactly this conundrum. The company, based in Falls Church, Virginia, created an XML editor that has balanced the needs of the non-technical author with the need to have valid XML output for the publishing needs of their organization. Their editor protects the users from the complicated side of XML, thus preventing distractions from the authoring process, while ensuring that the content they create can be validated against the organization’s DTD.

PowerXEditor is part of the PowerSuite of tools, which provides an online authoring environment that allows non-technical authors to create their material online and submit it directly into the publishing process. The built-in workflow sends completed work to an editor for review and comment, and lets the editor either return an article to the author for further polishing or approve it and automatically add the article to a bank of approved content.

XML advocates will undoubtedly ask how valid XML can be created, when the author isn’t exposed to the underlying rules of the DTD. This is handled in PowerXEditor by relaxing the rules for the writers. During the writing process, authors can work much like they do in a word processor. The key differences involve protecting the XML structure by disallowing functions such as adding a text box, or by limiting the copy-and-paste function to plain text. However, most of the menu bar items function much the same as an online text editor.

For example, during the creative process, a writer might create a list with only one bullet item, then return to the article later to add more bullet points. Or perhaps the writer adds a citation that doesn’t conform to the APA standard notation. The editing tool allows the writer to work, uninterrupted by error messages, until the writer decides the article is complete. For authors, completion is when they send their content in to the editors who will review and approve their work, or re-route the work back to the writers for further polishing.

After the content has been approved, and as the content is published, the XML is then validated. Any references are resolved, citations made to conform to the publication standard, figures properly labeled, and other anomalies fixed, in an automated process, by the software.  The automated validation process ensures that the content conforms to the organization’s DTD.
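The timing described here, relaxed rules while writing and strict checks at publication, can be sketched as follows. This is not Aptara's implementation: the Draft class, the two rules (a list needs at least two items, a citation needs a year attribute) and the element names are assumptions chosen to mirror the examples above, and the sketch only reports problems rather than repairing them automatically.

```python
import xml.etree.ElementTree as ET


def validation_problems(article_xml):
    """Return a list of rule violations; an empty list means the draft passes."""
    root = ET.fromstring(article_xml)
    problems = []
    for bullet_list in root.iter("list"):
        if len(bullet_list.findall("item")) < 2:
            problems.append("list has fewer than two items")
    for citation in root.iter("citation"):
        if not citation.get("year"):
            problems.append("citation is missing a year")
    return problems


class Draft:
    """Hypothetical draft that defers strict validation until publish time."""

    def __init__(self):
        self.xml = "<article/>"

    def save(self, article_xml):
        # While writing, only well-formedness is required; no rule errors interrupt the author.
        ET.fromstring(article_xml)
        self.xml = article_xml

    def publish(self):
        # At publication, the stricter rules are finally enforced.
        problems = validation_problems(self.xml)
        if problems:
            raise ValueError("; ".join(problems))
        return self.xml


draft = Draft()
draft.save("<article><list><item>First point</item></list></article>")
print(validation_problems(draft.xml))
```

The one-item list is accepted on save and would only be flagged on publish, which matches the writer's experience described above: work first, conform later.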

According to Aptara, about 95% or more of the validation decisions involved with relaxing the XML for authors are discovered and repaired by the parser and heuristics within the application. In the remaining cases, where a rule cannot be resolved by the application, Aptara technicians intervene as part of the standard support package.

Using role-based permissions, PowerXEditor controls the writing environment, from the login and dashboard, right through the process. The dashboards differ by role, giving writers, reviewers, and administrators separate sets of controls designed for them.

The PowerXEditor authoring tool is meant for use in publishing environments where writing is more than an occasional activity. For an organization to invest in software that unifies the tasks of authoring with workflow that takes that content through its review/revision and approval stages, the writing is likely high-value content that is published to strict standards, and involves authors who are subject matter experts (SME), and a publishing process that includes roles such as editors, art directors, and so on.

The product contains the type of workflow capabilities that one would expect in a product with this robustness. The workflow is configured by Aptara, with certain functions remaining configurable by the administrator. This builds in certain workflow flexibility, such as skipping a step or re-routing an article to someone not usually in the sign-off loop. The Administrator has access to a number of options usually associated with a content management system (CMS); when integrated with a CMS, these settings are controlled directly by the CMS.

The .NET framework, combined with the power of AJAX and Java, means that the browser-based application is cross-platform and cross-browser compatible. There is no software to download or install, and the controls are simple enough for any author to quickly learn. The content remains format-neutral – in other words, in XML – so there are none of the barriers posed by proprietary systems. PowerXEditor can integrate with a content management system or other workflow system through its application programming interface. Though a Microsoft SQL database controls content and workflow, the XML editor can be integrated with other systems.

The product is meant for multiple publishing channels, with strong support built in for publishers of print. In addition to standard features such as word count, PowerXEditor feeds high-end composition from InDesign and Quark to LaTeX. Its mini-castoff calculator provides a much more accurate idea of page layout restrictions early on, supporting editorial staff in their decisions about how to judge the amount of content in relation to the available number of pages.

For more information about PowerXEditor or the entire Aptara PowerSuite, contact one of the Aptara offices listed at http://www.aptaracorp.com, or email info@aptaracorp.com.

About Rahel Bailie
Rahel Bailie helps content-driven companies create and better manage their communication strategies. Her firm, Intentional Design, serves clients in and around North America and Europe. Rahel is a popular presenter at content industry conferences, is a frequent contributor to trade and industry publications, and leads several online discussion groups, including Content Convergence. Contact Rahel via email or via telephone at +1 604-837-0034.

The Power of the Crowd: Finding DITA Resources and Information

The Content Wrangler - Mon, 2008-12-01 16:25

You can find a wide variety of groups that focus on DITA on The Content Wrangler Community, the global network for content professionals. Membership is free and requires online registration. Each group is moderated by an industry expert. Some of the most popular groups include:


DITA Metrics: Developing Cost Metrics

The Content Wrangler - Wed, 2008-11-19 04:25

By Mark Lewis, special to The Content Wrangler

Table of Contents

Introduction
Cost Metrics Overview
Cost of a Project
Cost of a Topic
Cost of User Guides Without Topic Reuse
Cost of User Guides With Topic Reuse
Cost Comparison: User Guides With and Without Topic Reuse
Cost of a Reusable Master Topic
Cost of User Guides With Reusable Master Topics
Cost Comparison: Topic Reuse Versus Reusable Master Topics
Conclusion
About the author

Introduction

You’ve read all the papers (and attended all the webinars) on return on investment (ROI) for XML and you get it. You’ve already concluded that moving to the Darwin Information Typing Architecture (DITA) will likely save you tons of time and money. But management says, “Prove it!” This paper helps you determine the cost portion of the ROI calculation. What are my costs now? What will my new costs be with DITA? And what is the difference, that is, my savings? This white paper is the first in the DITA Metrics series, which will discuss cost metrics, reuse metrics, and a reuse strategy. It describes one model for calculating the cost of a DITA project. After doing some content analysis on your own documentation set, you can customize this cost model to suit your documentation project needs. In the end, you should be able to speak the financial language of managers and prove to them in dollar signs the value of moving to DITA.

To benefit from this article, you should have at least an intermediate level understanding of DITA including topic structure, elements, conrefs, child maps, and filtering/conditional processing.

For your convenience, we’ve provided a downloadable PDF of this article.

Cost Metrics Overview

The cost to develop content and reuse percent values are standard components in many ROI calculations. We need the cost and reuse values for a DITA project to determine DITA ROI.

This paper focuses on the cost of content creation and introduces various levels of reuse into the model. We’ll begin with a deep dive into the cost of creating DITA topics and then incorporate the cost of unique content, identical content and similar content.

Over the course of the series, we will discuss the following components of our cost model:

  • Cost of content creation (in this paper)
  • Cost of content analysis and inventory
  • Cost of review and project management
  • Cost of filtering
  • Cost of publishing
  • Cost of content maintenance
  • Cost of converting legacy content to DITA
  • Cost of translation

Cost of a Project

Let’s show how our model is used to determine the cost of creating user guides for three models of a fictitious personal digital assistant (PDA). PDA One has a base set of features. PDA Two has all the features of PDA One, plus several additional features. PDA Three has all the features of PDA One and PDA Two, plus several features unique to PDA Three. This documentation project contains lots of identical content and unique content.

The first version of our documentation project is relatively simple so that the associated cost model is also simple. Later, we’ll introduce more of the content reuse and conditional reuse features of DITA, and show how to incorporate these into the model. Gradually, we’ll increase the complexity of our project and our cost model.

Cost of a Topic

The first step is to design the content creation component of the model. Traditional cost metrics focus on the cost of a page [1]. Since pages are similar to topics, we will start our discussion with determining the cost of creating DITA topics. Costs are expressed in terms of content creator labor hours.

Table 1 shows the cost of creating Task and Concept topics. For simplicity, we exclude Reference and other topic types, but you can easily customize the model to include them if needed. The scope of these estimates is a topic and does not include time for project/publication level activities such as designing the document outline, user task analysis, project management, implementing context-sensitive help, testing, status meetings or design meetings.

Table 1 (Cost of a Topic)



*Includes time to learn the product through interviews or research.
** Screen shots/images: Includes time to create sample data that would be shown in screen shot, capture the image, convert the format, name the image using naming conventions, and store it in the repository.
*** The elements listed here are optional. For example, Concept topic > screen shot, or Concept topic > feature description.

Now we have an approximate cost range in hours for creating Task and Concept topics that we can use in our model.

Cost of User Guides Without Topic Reuse

Table 2 through Table 5 show the cost of developing a user guide for each of the three PDAs in DITA without taking advantage of topic reuse.

Table 2 (Cost of PDA One User Guide)

Table 3 (Cost of PDA Two User Guide)

Table 4 (Cost of PDA Three User Guide)

Table 5 (Total Cost All User Guides Without Topic Reuse)

This is really a worst case scenario that is not realistic because moving to DITA and not reusing any content is highly unlikely. But this simple project is a good starting point for the model and a base to which we can add reuse features of DITA. As you will see, the reuse features that you incorporate can be different for each documentation project.
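As a minimal sketch of this baseline, the Python below totals the cost of three user guides when every topic is written separately for each guide. The hours per topic and the topic counts are invented placeholders, not the values from the tables in this paper; substitute the figures from your own content analysis.

```python
# Hypothetical labor hours per topic type; replace with your own estimates.
HOURS_PER_TOPIC = {"task": 6.0, "concept": 4.0}

# Hypothetical topic counts for each user guide.
GUIDES = {
    "PDA One": {"task": 20, "concept": 10},
    "PDA Two": {"task": 25, "concept": 12},
    "PDA Three": {"task": 30, "concept": 15},
}


def guide_cost(topic_counts, hours_per_topic=HOURS_PER_TOPIC):
    """Cost of one guide in labor hours: count of each topic type times its hours."""
    return sum(count * hours_per_topic[kind] for kind, count in topic_counts.items())


def total_cost_without_reuse(guides):
    """Every guide pays full price for every topic, even identical ones."""
    return sum(guide_cost(counts) for counts in guides.values())


for name, counts in GUIDES.items():
    print(name, guide_cost(counts))
print("Total without reuse:", total_cost_without_reuse(GUIDES))
```

With these placeholder numbers the three guides cost 160, 198 and 240 hours, for a total of 598 hours.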

Cost of User Guides With Topic Reuse

Now we’ll incorporate the cost of reusable topics (reusable in more than one user guide) in our cost model.  Reusing topics is nothing new. This feature has been available in help authoring tools for more than 10 years.

Table 6 through Table 10 show the cost of developing a user guide for each of the three PDAs taking advantage of reusable topics. Some topics for the PDA One user guide may be reused verbatim in the PDA Two and PDA Three user guides because the topics are identical.

Table 6 (Cost of Reusable Topics)

Table 7 (Cost of Topics Unique to PDA One User Guide)

Table 8 (Cost of Topics Unique to PDA Two User Guide)

Table 9 (Cost of Topics Unique to PDA Three User Guide)

Table 10 (Total Cost All User Guides With Topic Reuse)

Cost Comparison: User Guides With and Without Topic Reuse

Although the scenario without topic reuse is highly unlikely and not realistic, just to be thorough, we are showing a cost comparison in Table 11 through Table 13.

Table 11 (Total Cost All User Guides Without Topic Reuse)

Table 12 (Total Cost All User Guides With Topic Reuse)

Table 13 (Savings)

For over a decade, significant savings have been achieved reusing topics in multiple publications.

Cost of a Reusable Master Topic

The project is simple when topics can be reused verbatim in multiple publications. But what happens to our model when there is sufficient variation in our products that we cannot write a single topic to describe a given feature? Perhaps the product screen shots are different, an extra note or warning is needed, or a button has a different label. For all three user guides, some topics are similar. Most of a similar topic is the same for each PDA and can be shared. So, if all three versions of that content are included in one topic, then all versions of the user guides may be published from this topic. Using filtering metadata, content that is unique is marked as belonging to a specific PDA. When the user guide for a specific PDA is published, content that is specific to the other PDAs is filtered. The filtering feature in DITA is also known as conditional processing and it is what allows us to create and use reusable master topics.
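To illustrate the filtering idea, here is a simplified Python sketch that publishes three variants from one master topic. It assumes content is marked with DITA's product attribute holding space-separated values (pda-one, pda-two and pda-three are invented values), and it reduces conditional processing to a single include rule; a real DITA toolchain drives filtering from a separate set of conditions rather than a function argument.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical master Task topic covering all three PDAs.
MASTER_TOPIC = """
<task id="sync">
  <title>Synchronizing your PDA</title>
  <taskbody>
    <steps>
      <step><cmd>Connect the cable.</cmd></step>
      <step product="pda-two pda-three"><cmd>Select a wireless network.</cmd></step>
      <step product="pda-three"><cmd>Confirm with the fingerprint reader.</cmd></step>
      <step><cmd>Press Sync.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def _prune(parent, product):
    """Remove children marked for other products; recurse into the ones kept."""
    for child in list(parent):
        marked_for = child.get("product", "").split()
        if marked_for and product not in marked_for:
            parent.remove(child)
        else:
            _prune(child, product)


def filter_for_product(root, product):
    """Return a filtered copy of the master topic for one product."""
    filtered = copy.deepcopy(root)
    _prune(filtered, product)
    return filtered


root = ET.fromstring(MASTER_TOPIC)
for product in ("pda-one", "pda-two", "pda-three"):
    steps = filter_for_product(root, product).findall(".//step")
    print(product, len(steps))
```

Unmarked steps are shared by every guide, so only the marked steps represent content that is unique to a particular PDA.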

Table 14 shows the cost of creating reusable master Task and Concept topics.

Table 14 (Cost of a reusable master topic)



It would complicate our model to incorporate conrefs to a large variety of content types. Therefore, we are limiting the incorporation of conrefs in our model. We’ll discuss conrefs later, but for now a simple example is to reuse feature descriptions or screen shots that were created in Task topics by conref’ing them in your Concept topics. This would reduce the cost of the Concept topic by several hours. You will be able to customize the model to incorporate your use of conrefs in your project.

Now we have the estimated cost for reusable master Task and Concept topics that we can use in our model.

Cost of User Guides With Reusable Master Topics

Now we’ll incorporate the cost of reusable master topics into our cost model.

Table 15 and Table 16 show the cost of developing a user guide for each of the three PDAs taking advantage of reusable master topics.

Table 15 (Cost of Reusable Master Topics)

Table 16 (Total Cost All User Guides With Topic Reuse and Reusable Master Topics)

Table 17 compares the cost of creating unique topics to reusable master topics.

Table 17 (Cost of Unique versus Reusable Master Topics)

Reusable master topics cost significantly more to create, but with proper planning they can be used in multiple publications such that the overall cost of content creation for your publications drops dramatically.

Cost Comparison: Topic Reuse Versus Reusable Master Topics

Table 18 through Table 20 show a cost comparison of a project taking advantage of reusable topics to a project that is taking advantage of both reusable topics and reusable master topics.

When filtering and reusable master topics are used, the number of unique topics and reusable topics that have to be created is drastically reduced. This is where the cost savings of using DITA really leaps ahead of conventional help authoring tools and technology.

Table 18 (Total Cost All User Guides With Topic Reuse and Reusable Master Topics)

Table 19 (Total Cost All User Guides With Topic Reuse)

Table 20 (Savings)

Conclusion

“What is the cost of DITA?” is the question that is the focus of this paper.

Chances are that if you are reading this paper, then you’ve read the papers on the promise of XML, the ROI of DITA, the clear savings in translation costs alone, and you get it. However, you are being asked to justify the cost. How much is it going to cost to develop and publish content using DITA? Knowing percent reuse is important, but equally important is the upfront cost. And the savings are in more areas than just translation. There are savings in both content creation and content maintenance due to content reuse and conditional reuse.

The cost to develop content and percent reuse are often discussed separately. But to know the cost of a DITA-based documentation project, both of these values need to be accurately incorporated into a cost model. In this paper, we focused on cost metrics. Later in this series we will focus on reuse metrics.

For a given project, we must know the cost to create content without reuse and the cost with reuse. The difference is the savings. The cost to create content with reuse is equal to the cost to create the unique content and the reused content. That’s the upfront cost. Knowing this will allow us to more accurately calculate the cost of translation as well. What must be translated is reduced to unique content and reused content.
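The arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration, not the paper's actual cost model: the `Topic` class, the function names, and the dollar figures are all hypothetical, and it assumes that without reuse a topic is rewritten from scratch for every guide that needs it.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """One topic in the content inventory (all values are hypothetical)."""
    name: str
    cost_to_create: float  # cost of writing the topic once
    times_used: int        # number of user guides that include it


def cost_without_reuse(topics):
    # Assumption: without reuse, every use of a topic is written again.
    return sum(t.cost_to_create * t.times_used for t in topics)


def cost_with_reuse(topics):
    # With reuse, each topic (unique or reused) is created only once.
    return sum(t.cost_to_create for t in topics)


def savings(topics):
    # The difference between the two costs is the savings.
    return cost_without_reuse(topics) - cost_with_reuse(topics)


# Made-up inventory: one topic used in three guides, one used in a single guide.
inventory = [
    Topic("Installing the product", 400.0, 3),
    Topic("Configuring the network", 300.0, 1),
]
print(cost_without_reuse(inventory))  # 1500.0
print(cost_with_reuse(inventory))     # 700.0
print(savings(inventory))             # 800.0
```

Swapping in topic counts and per-topic costs from your own content analysis would give the upfront cost figure that the paragraph describes.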

In Part 1 of this paper, we covered the following cost metrics:

  • Cost of content creation
    • unique content
    • reused content
  • Cost of a Task topic
  • Cost of a Concept topic
  • Cost of user guides without topic reuse
  • Cost of user guides with topic reuse
  • Cost of a reusable master topic
  • Cost of user guides with reusable master topics

The most important things we accomplished in this paper included determining the cost of creating a DITA-based topic rather than a traditional page. We then incorporated conditional reuse/filtering and determined the average cost of creating a reusable master topic. We observed that the biggest savings resulted when reusable master topics are incorporated. The flexibility and diversity of conditional reuse in DITA differentiate it from typical help authoring tool technologies and offer greater savings in not only content creation, but also content maintenance.

The ultimate question.

In Part 2 of this series (coming soon), we will look at other costs that need to be accounted for in a DITA project:

  • Cost of content analysis and inventory
  • Cost of review and project management
  • Cost of filtering
  • Cost of publishing
  • Cost of content maintenance
  • Cost of converting legacy content to DITA
  • Cost of translation

Other papers in the DITA Metrics series (coming soon!):

  • DITA Metrics: Cost Metrics – part 2
  • DITA Metrics: Reuse Metrics
  • DITA Metrics: Reuse Strategies for Minimizing Cost

Join the DITA Metrics Group
Interested in learning more about DITA metrics? Join the DITA Metrics group on The Content Wrangler Community. The group is led by the author of this paper, Mark Lewis.

About the author

Mark Lewis works for YOU, the tech writer, and the information architect. Mark has received STC awards for Distinguished Chapter Service and the Florida Technical Communications Competition. Currently, he is the DITA Product Manager for Usability and a product Evangelist for Quark. He provides product direction and user experience designs for Quark’s structured authoring products, allowing everyday authors to create reusable content without knowing the details of XML syntax. Mark presents at conferences and other industry events on topics including object-oriented design methodologies (for non-programmers to help jumpstart their understanding of structured authoring), information architecture, and the promise of DITA and XML. Mark manages The Content Wrangler Community Groups – WritingOBJECTively and DITA Metrics. Send questions or feedback to hyperwriters@hotmail.com.

------------
1. For more information on traditional cost metrics, see Lasecke, Joyce. “Stop Guesstimating, Start Estimating.” In Intercom, February 2006 issue, Society for Technical Communication.

DITA Metrics: Mark Lewis on Calculating The Cost of a DITA Project

The Content Wrangler - Wed, 2008-11-12 20:34

By Mark Lewis, special to The Content Wrangler

You’ve read all the papers on return on investment (ROI) for XML and you get it. You’ve already concluded that moving to the Darwin Information Typing Architecture (DITA) will likely save you tons of time and money. But management says, “Prove it.” This paper was created to help you determine the cost portion of the ROI calculation. What are my costs now? What will my new costs be with DITA? And what is the difference—my savings? This is the first in the DITA Metrics series of white papers. The series will discuss cost metrics, reuse metrics, and a reuse strategy. This installment describes one model for calculating the cost of a DITA project. After doing some content analysis on your own documentation, you can customize this cost model to suit your documentation project. In the end, you should be able to speak the financial language of managers and prove to them in dollar signs the value of moving to DITA.

[Note: To benefit from this article, you should have at least an intermediate level understanding of DITA including topic structure, elements, conrefs, child maps, and filtering / conditional processing.]

The DITA Metrics series:

  • DITA Metrics: Cost Metrics
  • DITA Metrics: Cost Metrics – Part 2
  • DITA Metrics: Reuse Metrics
  • DITA Metrics: Reuse Strategies for Minimizing Cost

Alfresco Is Not A Picnic: The Problem With Metaphors And Content Management Systems

The Content Wrangler - Tue, 2008-11-11 16:06

By Felice Bochman, special to The Content Wrangler

“Humans are wired to put things in buckets.  We have an innate need to create categories and sort things into them.”—Richard Hamilton on content delivery, The Content Wrangler, September 2008

Many editors and other language-oriented professionals I know look for metaphors as a way of figuring things out.  We tend to see the world through a “this is like this” lens complemented by a “how does this fit into the big picture” manner of thinking. At its most elemental, this is simply a way of bucketing things. It’s a matter of perception not taken lightly—I hope.

What “things” do we humans bucket?  Anything, really, but for editors this usually applies to content of some kind. And, once we have content of some kind, it’s only a hop, skip, and a jump to content of a particular kind, as in content of one type or another. Alas, content doesn’t magically appear on web pages. We “webitors” must use a CMS to publish language to our websites.  In and of itself, the idea of a CMS is no big deal—one usually needs some kind of delivery system to display or publish language pretty much anywhere in any medium—unless mental telepathy is involved. But, it’s how the CMS is structured that is at issue.

In the content management system I currently use, I’ve noticed no fewer than nine metaphors, which are meant to serve as organizing principles, but they don’t. Granted, the particular tool I use isn’t really meant for gobs and gobs of editorial work, but nonetheless its organization and structure were likely created by a developer within arm’s reach of a bottle of tequila.

Really.

Imagine the following. You’re brand new to the publishing tool you will use to do your work. You’re neither a stranger to web publishing, nor to any of several content tools you’ve used in the past. So, what’s so special about this one?  This one is Alfresco.  (Cue the choir.)

The name connotes the outdoors, picnicking, à la piscine, spring, not to mention something about renaissance painting (that would be a fresco but still, close enough). You get the idea: pastels, a breeze, distant laughter, potato chips, all swirling about erroneously in this identity crisis we call a publishing tool (not that there’s anything wrong with that).

Is it not true that when you encounter something brand new and unknown, the first thing that comes to mind is, “Do I know anything like this?” Or, based on what I do know or can relate to, “How should I get around this thing and make it work?” One faces the fear of the unknown by resorting to the familiar. It’s like driving in England. You know the deal: car, steering wheel, directional, seatbelt, traffic, horn, and roundabouts, except that you’re on the wrong side of the street (and you could die instantly, unlike using Alfresco, though it feels like you want to). The reality is that there are complicated heuristics lurking behind our human ability to make the aforementioned leap of faith when faced with the unknown. I mean, there’s no proof that using something familiar can help you decipher the unfamiliar. Right?

Enter Alfresco. 

See company home. Oh, good! Home. I know that! It’s a house, with a roof and rooms, and I live there with my family. Good! Let me get this content up on our site. I suppose I need to find a “door” or perhaps a “hallway” or “room” which would signify the next logical step in my trek through this “home.” Perhaps I’ll find a “kitchen” where I will cook up content, or maybe an “attic” or a “basement” where I will store the stuff.

But no!  No. No. No.

I leave the “home” and go to a playground—something for children.  Well…it must be a playground as I’m supposed to find a “sandbox” there. Ok, fine—maybe it’s in the backyard of the home.  I’m sure that once I’m in this sandbox, I’ll see some other playground equipment that will help me get this content on the site. Or, I’ll see some other children and they will help me. 

Like a kid new to the neighborhood, I find my way to the fields I need—a mere eighteen clicks away (that’s not army chat for 1.6 miles in case you were wondering). This might be a slight exaggeration, but when you’re new to the ‘hood, your perception of distance may be skewed. I am armed with keywords and categories and conversion tools, which for some reason are tiny—even tinier than I am (just over five feet).

But again, no!

Quick, quick, close my eyes and click my heels together—there’s no place like home, there’s no place like home.  Crap.  I’m still in the root folder (potato, rutabaga, garden?).  Maybe I’m not wearing the right shoes.

The content seems to be in there somewhere, but there’s a sudden invasion—the military has arrived! I’m faced with a deployment.  My thoughts turn to troops, war rooms, strategy, weapons, and uniformed men flicking their laser pointers at topographic charts. It all seems very far away from home or even from the sandbox where I once was. It’s okay, though.  I’m smart and flexible and I can switch metaphors again.

I’m in the army now. 

What modified items have to do with the army (perhaps a SNAFU or something gone FUBAR), I really don’t know, but I’m willing to give the military metaphor a chance in hell.  I submit the chance in hell.  Excellent.  I’m ready for the next maneuver from my commanding officer, Colonel Alfresco. But, alas.  He is gone.  He went away and a photographer took his place.

What the . . . ?!

This photographer cannot possibly be a war correspondent with a camera.  No—he’s just taking a snapshot.  What happened to my war?  The troops?  The plan of attack?  The battle?  Gone, gone, gone.  Sudden peacetime is a little distressing because I don’t have a camera or film or even subject matter to photograph.  There are no lenses, f-stops, apertures, tripods, light meters or anything even remotely photographic other than the mysterious snapshots.

Home. Sandbox. Root. Deployment. Snapshot.

What do the above have in common?

Nothing.  Zero.  The null set.

And that wasn’t a trick question.

You could force a commonality by saying, “a sandbox might be in the backyard of one’s home,” as I mentioned earlier in this article. But that’s a stretch.  No.  Actually, it’s a Hail Mary—and that would mean we’re talking about football.

Does this seem kinda random to you?

I pause to check my location and am hoping for something like a GPS metaphor (a.k.a. a map).  Where the heck am I?  Oh, I see.  I’m in production.

For real?  I don’t know if I can squeeze in another metaphor without having to buy rush tickets, if you know what I mean.  You don’t?  That’s funny, I mean, you look so much like someone I know. Where is wardrobe when you need them?

Anyway…

I have apparently arrived on Broadway, during a deployment, with the snapshot I found in the sandbox in the backyard of my home. And the potato. The god of dynamic systems is laughing his ass off.  I am possibly in a novel by Kundera.

There is only one thing left to try.

Space travel.

You laugh, but it is indeed the truth.  I’ve fallen off so many metaphors that I’m sure at least one snapshot will need fixing. In order to do that, space travel is involved.  This is very lucky because I was just at Epcot Center where I rode this cool space ride.  I was the virtual navigator of my own ship (my frightened flight team with barf-bags in tow).  I see the navigator tool.  And good heavens when I click it there is a shelf (hey, that makes ten metaphors, no wait, eleven if you include the potato, and twelve with the theater fiasco). Does this mean there’s a bookcase or library on my spaceship? No, of course not.  That wouldn’t make any sense at all. Perhaps it’s a continental shelf, though I really hope it isn’t as I’d prefer not to introduce a geological or oceanographic metaphor to this affair (though I do like to swim in the ocean).

By some miracle of semiotics, I launch the content I came to publish.  I don’t know what else to call it.  I was on a spaceship (or perhaps it was a boat), so the least I can do is launch…something.  If the content is a rocket, then I have just launched it from a…sandbox?  Houston, we have a problem.

I’m fresh out of metaphors, save one.  Though there is no chocolate factory in sight, there is a Charlie and a Chuck (two genius tech dudes in my office who, lucky for me, are very kind and often bemused)—and they are worth their weight in golden tickets.  HEY YOU GUYS!!!

Are there Oompa-Loompas in here too?

I’m hopelessly, hopelessly literary, but that doesn’t make me a technophobe.  In fact, I crave meaning within systems, pathways to usage that make sense, and sense which is based on experience of the empirical kind.

Build a tool.  Pick a metaphor to represent the structure and functions—extend the metaphor as needed—but use only one.  Trust this editor—that metaphor will contain all the buckets you need.

As for the picnic—I said it before—there’s no place like home, in your backyard, in the sandbox, potato salad and all.

Microsoft, Welcome to the SaaS World (and See You in a Year)

The Content Wrangler - Mon, 2008-11-03 18:02

By Rodrigo Vaca, Director Marketing, Zoho

Microsoft recently confirmed a widely circulated rumor and announced, with grand fanfare, that next year they will be announcing a web-based version of their Office product. Yes, you read it right, Microsoft announced that they will be announcing… You can read the full story in PC Magazine, ComputerWorld and many other publications.

The question for many will be… what does this announcement of an announcement mean for Zoho and other SaaS vendors? It’s simple: it means two things.

First—it means fantastic news! Microsoft had been pooh-poohing the whole SaaS world… even going as far as denying the inevitable and creating its own Software-plus-Services trend-of-one. But Microsoft took one big step forward, and added some extra validation to the whole concept of productivity applications delivered using nothing but a browser.

Second, and particularly for Zoho, it means business as usual. Will there be increased competition in the on-line productivity space? You betcha. But it’s not like we had a monopoly on that market to start with. We thrive on competition. We have multiple competitors for each and every one of the 19 (and counting!) different services we provide. That only makes us better. But beyond that, our users get value from having so many tightly-integrated SaaS applications. Zoho is much more than the on-line productivity suite. We have the most comprehensive portfolio of on-line productivity and business applications.

Maybe the real question is: what will this mean for Microsoft? What will this mean for their business model and their users?

For their business model, I wonder if they’ll charge the same for the on-line version as they charge for the old, dinosauric version. Are they still going to be able to collect the absurdly high CAL fees for office users? They surely risk losing their grip on the desktop as well; you don’t need Windows to run applications in a browser.

In any case, and beyond the business implications, let’s see how this works for the most important folks on earth: users. If Microsoft MSN Live Search, Microsoft MSN Live HotMail and a host of other Microsoft on-line products are any proof of Microsoft’s Internet prowess…

Microsoft, welcome to the SaaS world. See you in a year (or so).

P.S. I’m taking bets on the simple and elegant name the Microsoft on-line office will get. I’m betting on: Microsoft Office Live 2010 Standard Web Edition.

About the Author
Rodrigo Vaca is the director of marketing at Zoho. Rodrigo joined Zoho from Google, where he was responsible for driving international and channel marketing on the Google Enterprise team. Prior to that, Rodrigo was Product Manager for the SharePoint product at Microsoft. He received his undergraduate degree in Computer Science at Mexico’s ITESM and later earned his MBA at Stanford University.

Information Visualization: A Look At U.S. Newspapers And Their Picks For President

The Content Wrangler - Wed, 2008-10-29 13:18

This interactive map from the folks at 1000 Words is a collective representation of the endorsements made by US newspapers for the 2008 presidential candidates. As of the date of this posting, Senator Barack Obama (D) leads Senator John McCain (R) in the number of endorsements he has received from major newspapers.

To view individual newspaper endorsements, click on the corresponding red or blue balloons. Use the zoom feature to target the papers you’re interested in viewing in markets in which multiple major papers exist.

According to 1000 Words, the newspapers represented on the map were gathered from the 100 largest U.S. newspapers by circulation. In states where no newspaper ranks in the top 100, the paper with the highest daily circulation in the state, according to data provided by the Audit Bureau of Circulation, was included. National newspapers like USA Today were not included. In some larger markets in which more than one major news outlet exists, multiple newspapers may be included. Cities with more than one paper may overlap (e.g., The New York Times, New York Post, and Daily News).

The creators of the map like to remind everyone that the map “is not an endorsement of any particular candidate” and that it will be updated to reflect new endorsements as they become available.



Content Reuse: Is It Harmful?

The Content Wrangler - Mon, 2008-10-27 12:42

By Richard Hamilton, special to The Content Wrangler

A place for everything and everything in its place—Isabella Mary Beeton, The Book of Household Management, 1861

For a number of years it has been a matter of faith that the more content a technical documentation team reuses, the more efficient they are presumed to be. Vasont Systems, a content management system (CMS) vendor, claims its users average 71% content reuse. That is a bold claim, but I suspect that if you could show even 30% or 40% content reuse, you would earn bonus brownie points with nearly any manager. But, are you really more efficient? Let’s take a deeper look.

Terminology

  • Duplication: You separately maintain more than one copy of some piece of content in source control or a Content Management System (CMS). If you keep two copies of a glossary definition in your source control system, that would be duplication.
  • Reuse: You put the same piece of content into more than one deliverable in the same output medium. If you have just one copy of that glossary definition in source control, but include it in the printed versions of your Installation Guide and User’s Guide, that would be reuse.
  • Single sourcing: You deliver the same piece of content via different media. If you deliver the Installation Guide in print and also on the web, that would be single sourcing.

[Note: Any given piece of content can be reused or single sourced or both. Defining single sourcing and reuse separately may seem to be nitpicking, but the distinction is important.]
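To make the distinction concrete, here is a minimal Python sketch of the three definitions above. The `ContentItem` class, its fields, and the function names are hypothetical, invented only for this illustration; they do not correspond to any particular CMS.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A piece of content and where it ends up (hypothetical model)."""
    name: str
    copies_maintained: int = 1
    # Each placement is a (deliverable, medium) pair.
    placements: list = field(default_factory=list)


def is_duplicated(item):
    # Duplication: more than one separately maintained copy.
    return item.copies_maintained > 1


def is_reused(item):
    # Reuse: more than one deliverable within the same output medium.
    deliverables_by_medium = {}
    for deliverable, medium in item.placements:
        deliverables_by_medium.setdefault(medium, set()).add(deliverable)
    return any(len(d) > 1 for d in deliverables_by_medium.values())


def is_single_sourced(item):
    # Single sourcing: the same content delivered via different media.
    return len({medium for _, medium in item.placements}) > 1


# One glossary definition, printed in two different guides.
glossary_entry = ContentItem(
    "glossary definition",
    placements=[("Installation Guide", "print"), ("User's Guide", "print")],
)
# One guide, delivered both in print and on the web.
install_guide = ContentItem(
    "installation steps",
    placements=[("Installation Guide", "print"), ("Installation Guide", "web")],
)
```

Under this model the glossary entry is reused but not single sourced, and the installation steps are single sourced but not reused, which matches the two examples in the definitions.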

Why Minimize Reuse?

I doubt anyone would argue against minimizing duplication. The benefits are clear and the exceptions are relatively few. I also agree wholeheartedly that single sourcing makes very good sense. But, I differ with the mainstream regarding reuse. I believe you should minimize reuse, not maximize it.

There are two main reasons for minimizing reuse:

  1. Every time you reuse content, you give your users another place to look at when they search for that topic. If you have the same content in several different places, your users can end up jumping around among those places, trying to figure out which one they should use. Having one, authoritative place for any particular module will simplify their search and avoid confusion.
  2. Even with highly structured methodologies, reuse is not free. When you reuse content, you need to take steps to be sure that content will work in multiple locations. This takes effort that might not need to be expended for content that is not reused.

Driving out Duplication

Most efforts to maximize reuse start by looking for and driving out duplication. The search for duplication typically identifies places where there is an exact or close match between two or more pieces of content. For each of those matches, you have three choices:

  1. Continue to maintain two (or more) versions of the content.
  2. Merge the matching content into one version, store that version in source control, and use it in each of the original locations as you build deliverables.
  3. Remove all but one instance of the content, and if you must, point to it rather than copy it.

All three of these choices cost something. Choice one abandons reuse, giving you more content to maintain. That is not always a bad choice; if there are enough differences in the content to make maintaining one version more expensive than maintaining two versions, this might be a valid option.

Choice two is classic reuse; you will have some additional work making the module work in multiple contexts, and you will have some additional work over time maintaining that independence. But, you will usually save effort over maintaining separate versions.

Choice three eliminates duplication and reuse. If you can eliminate all but one of the situations that used the content, you have not only eliminated the duplication, you have reduced the overall size of your deliverables. When it works, this choice is the most efficient of the three.

The Bias Towards Maximizing Reuse

I see a bias towards choice two in most of what I have read about content reuse; in fact, often choice three is nowhere in sight. While structure is given its due, I see little discussion about structuring content to minimize the need for reuse. Several factors fuel this bias:

  • Metrics: It is easier to create a metric to measure reuse than it is to create one to measure where you have avoided the need to reuse.
  • Human nature: Choice three requires you to eliminate content. Since nearly all content was originally generated because someone needed it, your natural inclination will be to keep content, even if it is redundant.
  • Content Management Systems: The typical CMS makes reuse easy. Just mix and match modules, push a button, and poof, you have a new deliverable.
  • Structure: Choice three requires you to look more deeply into your structure, which your team may not have the time or inclination to do.

If unchecked, these biases can leave you with a lot of unnecessary reuse. You can argue that’s not a big deal, but even when well structured, a heavily reused module will take more maintenance than one that is used in just one place. In addition, it will needlessly increase the bulk of your deliverables. Both of these factors decrease efficiency. If you are serious about maximizing your efficiency, you need to structure your documentation with a bias against both duplication and reuse.

Implications for Modular Documentation

So, am I arguing against modular documentation? No. Consistent structure and style help people use your documentation. And good methodologies give your authors the guidelines they need to produce consistent structure and style. Where things go off the rails is when you try to treat your documentation as a set of modules that can be indiscriminately mixed and matched to create whatever deliverables you want.

Content that is central to your message deserves a context within which it can live. If it is pulled out of context, it will either be confusing, or it will require additional information to provide that context, either as part of the module itself, or in the including document.

Jon Bosak summed up the problem nicely in his Closing Keynote at the XML 2006 conference:

Another ancient subject that seems to be popping up again is the idea of modular document creation. This is one of those concepts that comes through about once a decade, seduces all the writing managers with the prospect of greater efficiency, takes over entire writing departments for a couple of years, and then falls out of favor as people finally realize that document reuse is not a solvable problem in document delivery but rather an intractable problem in document writing – which is, how to retain any sense of logical connection between pieces of information while writing as if your target audience consisted entirely of people afflicted with ADD.

While I do not have quite as pessimistic a view of modular documentation as Bosak expresses here, I do think that maximizing reuse without considering context and structure yields documentation that is difficult to use. Even if your structure allows you to easily reuse modules, there is benefit in doing so only when you have a compelling reason.

What I advocate is to first build a structure that minimizes the need to reuse content, then judiciously choose where, within that structure, you will reuse. Obvious cases include glossary entries, legal boilerplate, and repeated procedures. And you will find other places where it simply makes sense to include content rather than sending the user off somewhere else to find it.

If you start with Isabella Beeton’s words in mind, you will end up with less reuse, better structured documentation, a more efficient process, and maybe customers who do not feel you are inflicting ADD upon them.

About the Author
Richard Hamilton is principal consultant with R.L. Hamilton & Associates, specializing in documentation management and the application of XML technology to documentation. He is the author of the forthcoming book, Managing Writers: A Real World Guide to Managing Technical Documentation, which will be published by XML Press later this year. 

Economic Woes Signal Content Industry Job Losses: It Could Happen To You!

The Content Wrangler - Tue, 2008-10-21 16:17

By Maxwell Hoffmann, special to The Content Wrangler

It was just another beautiful autumn day in Portland, Oregon. Since it was October 1st, I swiped my credit card and bought my monthly MAX train pass for $86. If I’d been psychic, I could have bought the $4.75 one-day pass. I wouldn’t need the train for a while.

I had barely put my bag down in my office when my boss told me we needed to meet privately. After leading me into the conference room he placed papers on the table and said, “Max, it’s not you, it’s the economy.” I didn’t need to look at the papers to know what was happening. There was a lay-off. And my name was on the list, along with others. Less than two hours later I was on the train home with an empty computer bag, my usual book bag and the oh-so-obvious cardboard box overflowing with personal contents from my desk and adjacent walls. The woman across from me gave me a sorrowful look, and then averted eye contact. She moved to the other end of the train as soon as another seat opened up. I managed not to cry until I got into my car. So, was this how it was going to be?

Through subsequent days I moved through total shock and numbness, to tears, to anger and finally returned to rational thought.  All normal emotions. Although my instincts are to paint a pretty picture, I’m sharing my raw experience because this could happen to you. If it doesn’t, it will happen to someone you know before the end of the year. Going through this has taught me a lot of lessons, both in terms of resources to fall back on, and what not to say to a friend who’s been laid off.

The shock: my company had created a new position for me at the beginning of the year and had relocated me 1,000 miles north from Los Angeles. Before signing on a home purchase in May, I had asked my boss and one other source, “Is there anything I need to know that could put me at risk in the foreseeable future?” I was assured there was nothing to worry about, so I became a local homeowner. Two days before the lay-off, my name had just gone out on a proposal. I was planning a second round of training with a major customer. I had recently received a bonus and was booked about a month in advance. After 9/11 a previous employer had downsized from 167 employees to only 27 employees. Against all odds, I had survived all of those lay-offs. It was different this time.

Now I have a mortgage. Now I have no income. The woman at my grocery checkout stand has a job; I don’t. How could this be happening? “It’s the economy.” Since this happened on October 1st, before the biggest stock drop in history, one can only imagine how many others have heard those words across the country, in every segment of the workplace. So, before someone in your company HQ decides to change your row in a salary-sorted Excel spreadsheet to pink, read on for some useful tips.

New tools for tough times

Fortunately, we have stronger social networking tools available than existed right after 9/11. Facebook, Plaxo, even Twitter have matured and made it easier than ever to connect and get the word out that you’re up for hire. And we still have CareerBuilder, Monster and the old war horses. But I have found LinkedIn to be the most potent tool for finding real opportunity.  I’m not out of the woods yet, but thanks to my LinkedIn connections I can see light at the end of the tunnel. (Note: all of my LinkedIn hints should apply to similar networking sites.)

  • Hint 1: If you haven’t done so already, sign up for LinkedIn.com.  Most new LinkedIn users start out limiting contacts to former co-workers and people they’ve met face-to-face. Other new users make the mistake of passively posting a resume and waiting for others to “find them.” You have to take the first step in order to build up useful contacts; the whole point of LinkedIn is to expand your connections through “friends of friends” who could form a mutually beneficial relationship. Ask for appropriate introductions. These connections will never be more beneficial than when you are out of work. (And what a time that is to find out who your real friends are!)
  • Hint 2: Let virtually every contact in your LinkedIn network (or address book) know what happened to you, what your qualifications are and exactly what you are looking for. Don’t assume that friends/contacts will just read your profile and connect the dots leading to your dream job. You need to mention specific industries, job types and geographical locations you would consider. So far I’ve had over 75 positive responses just from LinkedIn contacts in a little over a week. I have three telephone interviews scheduled early next week, with recruiters or potential employers that came from LinkedIn contacts I have never met face-to-face.
  • Hint 3: There is a lot of hidden power in LinkedIn. Though a potent tool, it does not have an intuitive user interface. There are three inboxes, and two of them can conceal archived messages. (Hint: go through the tutorial several times.) Learn how to use the advanced search features on LinkedIn when searching for jobs and people. You can confine your search to people who still work for a certain company, and who live a certain number of miles from a certain zip code. The larger your network, the richer the results.
  • Hint 4: When you do identify a company with a position that seems like a good fit, use LinkedIn to search for former employees of that company. Contact a couple of recent former employees and find out if your next potential employer has a history of (a) frequent re-organization, (b) turnover in management or (c) high turnover in certain positions (like project managers). Does the company keep sales people for a year to build up cold calls and then replace them? The business intelligence you derive this way is worth its weight in gold, and well worth the optional $19.95 per month to upgrade to “premium” LinkedIn.


How do you get noticed with 500 competitors for the same job?

Avoid “canned” or boilerplate responses to job listings. Write each cover letter from scratch (even the on-line ones) and always mention several points addressed in the job description.  Have multiple resumes prepared ahead of time that highlight a variety of strengths for different careers you may be focusing on. I have cover letters and resumes for five different job searches: (1) Translation/Localization, (2) Desktop Publishing/Production, (3) XML and Content Management, (4) Content Authoring/Editing and (5) Course Development and Training.  I have created a resume for each discipline which highlights appropriate career accomplishments and mentions relevant former customers or business partners.

So, what if you’re not laid off yet and you want it to stay that way?

Everyone needs a periodic career “make over” or re-branding, even if you intend to stay with the same employer and advance within its ranks. Due to the dynamics of the world economy, your job is becoming increasingly dependent on overseas economic health.  If your company is to survive and thrive, product and service sales will probably have to increase in non-English speaking markets.

There are a variety of things you can do to increase your value, especially if you are involved in content creation:

  • If you’re already LinkedIn, start requesting (and giving) some recommendations and endorsements. A few simple endorsements from co-workers and customers carry more weight than anything you can do to “blow your own horn”.  It has taken me 3 years but I now have over 30 endorsements. Rule of thumb, you will get one endorsement for every two or three that you post for others.
  • Master the principles of controlled or simplified English. Even if you don’t have software to support your effort, learn to create content that is unambiguous and always has a clear subject / object / action.  Why? Chances are that anything you create within the next two years could be reused in a non-English speaking market. Go to http://www.smartny.com for some samples.
  • Become aware of localization/translation issues. Attend a free webinar with ENLASO or one of the many other companies who offer such services. Find out how to create content with maximum leveraging or “re-use” of previously translated content. There are a host of seemingly innocuous document objects that can wreak havoc in translation when they are misplaced. (Hint: unstructured FrameMaker users, don’t put an index marker in the middle of a word.)
  • Learn some simple methods of authoring content using the Darwin Information Typing Architecture (DITA) and XML. Even if you have to use a free, downloadable editor, make the move into topic-based authoring. This is where the world is heading (with, or without you). Even if you don’t use actual DITA structure, with all of its benefits of references and reuse, topic-based authoring creates clearer content with higher reader retention.
  • Even if you have to take a night class at a community college, learn to communicate visually through video, motion capture and similar techniques. Products like Adobe Captivate, Adobe’s Technical Communication Suite and others are blurring the lines between products that used to be associated with product design and authoring. Diagrams that contain *.swf files or interactive 3D diagrams help eliminate the need for pesky “call outs” that have to be translated.

What not to say to a laid off friend

But enough about you and your potential traumas. What about my pain and suffering?  Having never been laid off before, I hadn’t realized how stinging the handful of “insensitive” replies from friends can be. I’m sure that I’m guilty of having said at least one of the phrases below in the past to a laid off pal.  Here are some actual quotes from responses I got to a mass e-mail that made it clear I had been laid off, have a mortgage and need a job soon:

  • “Why don’t you just register with Adobe’s job board? Someone with your talent should be able to land a job in about a week!”
  • “If you were so great, why did they fire you?” (No, I’m not making this one up: I had to explain the term “lay-off” to one of my European friends.)
  • “Think of this lay-off as God’s gift to finally give you the time to find that job that you really love.”
  • “Why don’t you just become a Java programmer?”
  • “If you don’t truly love your work, someone else who loves it more can come along and take it away from you.”
  • “Have you ever thought of becoming a teacher in elementary school? I hear that there are shortages there now.”
  • “Why don’t you go into Health Care? I read a newspaper article that there are plenty of openings over there.”

If you don’t know of any possible leads or advice, just say so.  That gets you off the hook, and you avoid a silence that says “I don’t care.”

Offers of help come from unexpected places

I’ve learned that you should contact everyone, even friends who are already laid off. You can still find an opportunity that is good for someone else. Surprisingly, the most concrete assistance from LinkedIn contacts came from people I hadn’t seen for a long time, and those “friends of friends” I connected with some time back. I even had three people I’d never met ask to become LinkedIn connections just so they could share their contacts.  Evidently some of my contacts had shared my situation with them.

One thing this experience has taught me is that most people are basically good and concerned, and genuinely want to help you in your time of need. One of those new contacts on LinkedIn shared a quote she had heard from an actress on Turner Classic Movies, “Remember, no matter how small the part, you are always as good as the best thing you’ve ever done.” That being the case, I know that I will land work, be it contract or permanent.  Mortgage payments shall be met. And eventually, small luxuries that I took for granted shall return.

And even though “it’s the economy,” I will be contributing to it again soon.

To see what I might have to offer your organization or customers, check out my LinkedIn profile. Or, contact me at maxwellhoffmann@gmail.com or +1 (503) 805.3719.

Effective Content Reuse: Storing Paragraphs, Not Topics, Is Key to Content Management Success

The Content Wrangler - Fri, 2008-10-17 10:20

By Paul Trotter, CEO, Author-it Software Corporation

Various reports have shown that knowledge workers spend about 30% of their time looking for content that has already been created.  If that sounds like a colossal waste of time and money, it is.

But in terms of waste, time and money are only the tip of the iceberg.  The more pressing problem is that by continually creating corporate documents from scratch, companies run the risk of producing external and internal communications that are inconsistent in style, appearance, and - even worse - message.  The ramifications of these shortcomings can be disastrous, particularly with respect to industry-specific compliance issues.

Consider a financial institution that generates vast amounts of investment offers for its customer base.  Disseminating the most up-to-date information quickly and accurately to end users is critical.  However, if a law is changed or the restrictions on an investment vehicle are revised, and the material reflecting these changes is either inaccurate or not distributed in a timely manner, an institution leaves itself vulnerable to potentially massive fines for non-compliance – perhaps even a litigious situation.

Clearly, such examples, as well as less dramatic ones, highlight the need for content management within most organizations.  However, many of these entities are relying on content management systems (CMSs) that are woefully insufficient, or on no system at all!  Frankly, many organizations that are authoring and managing huge volumes of content are using nothing more than simple desktop products.  What they’ve realized, whether it’s triggered by compliance issues, lack of resources to manage the problem, fear of litigation because of inconsistency in the content, or the increasing cost of content translation (localization), is, “We’re not doing this very efficiently.”

At its core, the primary reason people consider component content management is that there is duplication of content and they realize that they can save time and money, as well as increase the consistency of their internal and external communications, by simply reusing previously approved information.  Further, it holds the promise of improving the speed and productivity of the people producing these communications.  There are “trickle down” effects as well: editing is easier and translation to multiple languages is simpler, to name just two.

When company management reaches the conclusion that reuse is not feasible using the current tool set, they explore the idea of reuse with increased diligence; what they find is a fundamental difference at the very core of competitive systems.

A Topical Approach

Most content management organizations promote the concept that in order to reuse content you must segment content into topics. This approach works well for technical information because with technical content you are describing concepts, asking people to perform tasks or follow steps, or providing reference material.  Consequently, you can reasonably and easily create topics that represent concise ideas, and ultimately, small chunks of content.

However, while people might comprehend the benefits that topic-oriented documentation provides, they generally don’t grasp the downsides of such an approach.  One of the first requirements that need to be fulfilled in order to utilize the topic-based method is for people to start examining how they write.  They must figure out how they’re segmenting topics.  They also have to write in a style that is consistent, so that when other personnel are assembling documents, the documents sound and read like they’ve been written by the same group of people.  But this is not always possible from document to document.

For instance, a user document is typically written in a “second-person” format, i.e., “you will do this or that.” Conversely, a sales proposal, or a response to an RFP would normally be developed in a “third-person” format, i.e., “the user will do this or that.” This minor difference in the way diverse types of documents are written presents a formidable challenge in whether topic-based reuse will be practical or not.

Secondly, topics must be crafted in a way that makes them reusable so they can be slotted into any number of documents without causing problems of context.  Then, people have to be disciplined enough to go out and actually seek the information they need and place it in their documents.  Obviously, it is essential to have access to tools that will support this task, but it’s even more crucial to have the required discipline in order to gain the benefits that the topic-based content management promises.

Few content management companies truly understand the problems behind topic-based content reuse.  In the end, all of the problems actually come down to one factor: people.  Either people are unwilling to take the time to seek out content similar to what they’re writing, or they don’t even know it exists, so they don’t think to look for it.

How can these problems be addressed?  If the information is made even more granular, then it’s even harder to work with.  Part of the reason topics became difficult for people to work with is that when they’re used to working with a 100-page document, a two-paragraph piece of information seems too small, too granular to deal with.  What’s more, the time saved through the reuse of content – by reducing the time spent on other parts of the writing and editing process downstream – is negated by the time and management overhead required to deal with these minute pieces of information.  And the more granular the information, the more burdensome it is to manage.  If you can’t save money, there’s no point in doing it.

Even when a writer finds the topic that he or she is looking for, making it usable to the next person is a job in itself.  Let’s say you’re writing some marketing content for a website.  You know someone has written something similar, but you’re not sure where it is.  Using traditional search methods, you find the piece you were after, but it was inside a larger topic describing the subject in far greater detail than you needed; two of the paragraphs were exactly what you wanted to use, but the rest of it is superfluous.

In order to reuse just those two paragraphs using a topic framework, you would first have to turn those two paragraphs into a separate topic.  Then you would nest that topic in the original place you found it; you would then have that smaller topic to reuse where you wanted it in the new document.  What this all means is that if you’re creating topics, not only do you have to create a topic that makes sense for your current purpose, you must be prescient enough to guess or foresee the ways people might use that content in the future, possibly in more granular form.  Clearly, this is not a practical exercise; the amount of effort to break up the desired content and “re-reference” it is substantial - to the point where you’re simply going to copy and paste it.  In a topic-oriented world, that creates duplication, the very thing you’re trying to avoid.

Paragraphs Preferred

The obvious question, then, is, “Why not just save the paragraphs and forget the topics?” In other words, go to the paragraph level, but do it in a way in which the user doesn’t have to really do anything.  There are a few products that do store content in paragraphs rather than topics.  As a result, if a writer was to copy and paste a paragraph into another topic and save it, the system would say, “I’ve got that paragraph; I’m going to reference it.” In this way, content is never duplicated, and all identical content residing in the company’s database is instantly consolidated.
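The paragraph-level idea can be illustrated with a minimal Python sketch. It is not how any particular product works; the class name ParagraphStore, the use of a SHA-256 hash as the reference, and the sample sentences are all assumptions made for illustration. The point is only that saving the same paragraph twice yields one stored copy and two references to it.

```python
import hashlib


class ParagraphStore:
    """Hypothetical store that keeps each distinct paragraph once."""

    def __init__(self):
        self._paragraphs = {}  # reference id -> paragraph text

    def save(self, text: str) -> str:
        """Return a reference id; identical text always maps to the same id."""
        ref = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        # setdefault stores the text only if this reference is new.
        self._paragraphs.setdefault(ref, text)
        return ref

    def get(self, ref: str) -> str:
        return self._paragraphs[ref]

    def __len__(self) -> int:
        return len(self._paragraphs)


store = ParagraphStore()

# A topic is just an ordered list of paragraph references.
topic_a = [
    store.save("Reset the device before installing."),
    store.save("Contact support if the light stays red."),
]
# The first paragraph is pasted into a second topic and saved again.
topic_b = [store.save("Reset the device before installing.")]
```

Here topic_a and topic_b point at the same stored paragraph, so a later correction to that paragraph would only need to be made once.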

Even fewer of these products can go to the next level.  When the developers of these products started looking at paragraphs lined up next to each other, they saw many that were very similar.  They may have very tiny differences that were not obvious to the user, but the computer could easily identify them.  So the products were equipped with a background algorithm that compares every paragraph to every other paragraph in the database and generates a “matrix of similarity.” Through visual highlighting, the user is shown similar paragraphs to the one that is being written; most of the products use a color to show paragraphs that exhibit similarity of 95% or higher.  This affords the user the opportunity to consolidate similar content into one way of saying it.
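As a rough sketch of what such a “matrix of similarity” might look like, the Python below compares every paragraph to every other one using the standard library's difflib.SequenceMatcher. The choice of that similarity measure, the function names, and the sample paragraphs are assumptions for illustration; the products described here may well use a different algorithm. Only the 95% threshold comes from the text above.

```python
from difflib import SequenceMatcher


def similarity_matrix(paragraphs):
    """Compare every paragraph with every other; scores run from 0.0 to 1.0."""
    n = len(paragraphs)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            score = SequenceMatcher(None, paragraphs[i], paragraphs[j]).ratio()
            matrix[i][j] = matrix[j][i] = score  # the matrix is symmetric
    return matrix


def similar_pairs(paragraphs, threshold=0.95):
    """Return index pairs whose similarity meets or exceeds the threshold."""
    matrix = similarity_matrix(paragraphs)
    return [
        (i, j)
        for i in range(len(paragraphs))
        for j in range(i + 1, len(paragraphs))
        if matrix[i][j] >= threshold
    ]


paragraphs = [
    "Press the power button for five seconds to reset the device.",
    "Press the power button for five seconds to reset the device!",
    "Our investment products are subject to regulatory approval.",
]
pairs = similar_pairs(paragraphs)  # candidates a writer might consolidate
```

Comparing every paragraph to every other grows quadratically with the size of the database, which is presumably why the text describes it as a background task.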

Sometimes, the only differences between paragraphs are white space, punctuation, or capitalization.  Paragraphs that differ only in these ways are placed into a special category called “Exact Match,” while the others are called “Fuzzy Match.”
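One way to picture the two categories is the Python sketch below. The normalize and classify functions, the “different” label for unrelated paragraphs, and the use of difflib for the fuzzy comparison are assumptions for illustration, not a description of any product's internals.

```python
import re
import string
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Drop ASCII punctuation, collapse white space, ignore capitalization."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", no_punct).strip().lower()


def classify(a: str, b: str, threshold: float = 0.95) -> str:
    """Label a pair of paragraphs as 'exact', 'fuzzy', or 'different'."""
    if normalize(a) == normalize(b):
        return "exact"  # differs only in white space, punctuation, or case
    if SequenceMatcher(None, a, b).ratio() >= threshold:
        return "fuzzy"  # nearly the same wording
    return "different"


exact = classify("Save the file,  then close it.", "save the file then close it")
fuzzy = classify(
    "Save the file, then close the editor window.",
    "Save the file, then close the editor windows.",
)
other = classify("Save the file.", "Contact support.")
```

Note that string.punctuation covers ASCII characters only, so typographic quotes and dashes would need extra handling in real content.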

Of course, it is also important to prevent people from creating these differences in the first place – that is one of the primary tenets of successful content reuse.  Consequently, it is optimal if these differences can be circumvented at the most effective time possible: the actual typing process.  As the writer types, similar content is suggested, so the writer can choose to reuse content at the point of content creation.  If you analyze the value and investment of a CMS, the only opportunity you have to save financial and human resources is at the time the user is typing; once content has been typed, time has been used, and the opportunity evaporates.
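A minimal sketch of suggesting content while the writer types might look like the following Python. The suggest function, its cutoff and limit parameters, and the approach of comparing the typed text to the opening of each stored paragraph are all assumptions for illustration; a real system would likely use an index rather than scanning every paragraph.

```python
from difflib import SequenceMatcher


def suggest(typed, stored_paragraphs, cutoff=0.8, limit=3):
    """Rank stored paragraphs whose opening resembles the text typed so far."""
    typed = typed.strip().lower()
    if not typed:
        return []
    scored = []
    for paragraph in stored_paragraphs:
        # Compare only as much of the paragraph as the writer has typed.
        opening = paragraph[: len(typed)].lower()
        score = SequenceMatcher(None, typed, opening).ratio()
        if score >= cutoff:
            scored.append((score, paragraph))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [paragraph for _, paragraph in scored[:limit]]


stored = [
    "To reset your password, open the Account page.",
    "Our fees are listed in the annual report.",
]
# The writer has made a typo, but the approved paragraph is still offered.
suggestions = suggest("To reset your pasword", stored)
```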

There is an ancillary benefit to an effective CMS.  Not only are users prompted with suggestions of available, matching content, it also becomes obvious to users how certain documents are written.  The writer might be composing a piece that is slightly different from the corporate “norm,” but as soon as suggestions of content are presented, it is quickly apparent what the proper style is for that type of document.  The writer might well create a unique manuscript, but it is still going to be written and structured the way other content is written and structured within the company, department or division - not from a language or technical point of view but from the perspective of actual sentence structure.  Thus, there will be consistency not only in reuse of content but also in style.

It should be noted that while it is preferable for a CMS to operate on a paragraph level, the ideal CMS can function on a topic level as well.  Once a solution has offered content suggestions at the paragraph level, it will optimally allow the user to view all of the contexts, or topics, in which the information appears.

The Human Factor

As pointed out earlier, even with the most powerful content management tools offering content suggestions to the user at the paragraph level – without the user being burdened by the need to search or even ask for it – there are human factors that can prevent a successful system implementation.  The net effect is that many of these systems never reach their true potential.

The range of human factors is almost as diverse as the humans who use them.  Most are thinly veiled excuses that, unfortunately, mask some writers’ insecurities, biases, or inability to adapt to new technology, or even change in general, such as:

  • “John’s new with the company; he didn’t know we had that kind of system.”
  • “I knew we had the system, but I just figured it would be easier to write the document over again.”
  • “No one can write as well as I can – I don’t want to use other people’s inferior material.”

Clearly, the goals of the organization, at least in the area of document creation, are sometimes at odds with the goals – or at least the approach - of the people who work there.  There are a number of reasons this can occur, as stated above.  But in the end, there is usually one overriding factor that serves as the foundation of all the adoption issues: employees don’t view content as a corporate asset that takes time and resources to create.  And because few people consider it an asset, few people track it and manage it the way they would manage, say, the company’s financial resources.

In the final analysis, the best content management tool will not succeed unless it is easy to use and people are willing to use it.  What’s more, there has to be a set of initial guidelines set down at the corporate or department level that people will actually follow.  If they don’t adhere to them, the entire exercise is a waste of time.

Granted, this will involve some work, and it might seem that the initial effort to integrate the system into the corporate culture will make the overall task of creating documents harder than before.  But if done correctly, it will be the classic case of taking one step back to take 10 steps forward.

About the Author
Paul Trotter is the founder and CEO of Author-it Software Corporation. He is a sought-after presenter and well-known expert on the subjects of single sourcing, component content management, collaborative authoring, and localization.

Editing Modular Documentation: Some Best Practices

The Content Wrangler - Wed, 2008-10-08 11:40

By Michelle Corbin and Yoel Strimling

This article presents eight guidelines that we consider to be best practices for editing modular documentation. These guidelines are based on both the key concepts of modular documentation and on our combined experiences of editing modular documentation. We also make three concrete suggestions about how editors can be involved in the process of creating modular documentation. We believe that these guidelines and suggestions can help both writers and editors work together to create clear, consistent, and usable modular documentation.

Much has been said about the creation of modular documentation – from content management systems, to information architecture, to delivery forms, to the usability of modular content (content being easier to use, easier to understand, easier to find, and so on). However, not much has been said about the editing of that content, and what the editor’s role is in such an environment.

Illustration by: Leo Blanchette - Fotolia.com

Researchers suggest that writing modular documentation is different from writing linear documentation because writers are now developing content as opposed to writing in the traditional sense. Therefore, editing content requires developing a mindset different from that of traditional editing.

The Mechanics of Modularization

The main goal of technical documentation is to provide access to technical content in an easy, efficient, and logical way. By separating descriptive from procedural information, organizing this information into discrete, standalone modules called topics, and then linking related topics to each other, writers can help readers quickly and easily find and use the information they need. Writing in a topic-based modular manner also helps writers better organize, construct, and write their documentation because it forces them to think about how to present the information in a clear and succinct way.

Modularization is based on these main concepts:

  • Chunk text into logical standalone topics
  • Label topics with clear and meaningful titles
  • Link related topics to each other

Chunk Text into Logical Standalone Topics
Chunking is the process of categorizing blocks of information into independent standalone topics based on their content type, and focuses on separating descriptive information from task-oriented information. This separation helps readers access only the most important and relevant information, structured in a concise and easily readable format.

The three basic types of topics are concept topics, task topics, and reference topics:

  • Concept topics provide background information that readers need to know before they can successfully understand and use a product or service. These topics describe concepts in a descriptive or narrative format, and answer the questions: What is it?, What does it do?, How does it work?, and Why is this important?
  • Task topics provide sequential, step-by-step instructions that describe how to do something, that is, they answer the question: How do I?
  • Reference topics provide additional detailed explanatory information. These topics usually present the information in a structured lookup table or list format.

Label Topics with Clear and Meaningful Titles

Labeling is the process of creating unique, clear, concise, and accurate headings that correctly identify the content included in a topic. This structured format further separates descriptive from task-oriented information and provides immediate visual clues to orient readers, enabling them to easily search for and access the information they need.

Link Related Topics to Each Other

Linking is the process of connecting topics to other related or relevant topics, which enables readers to easily jump back and forth between related subject matter in a document and to find the information they need. While the hierarchy represents the structure and flow of the different standalone topics, the links are the glue that holds all of the different topics together to provide meaningful content.
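The three concepts (chunk, label, link) can be pictured as a small data model. The Python below is only a sketch under assumed names: the Topic class, its fields, and the two checking functions are hypothetical and do not correspond to any particular authoring tool or to DITA markup. It shows how typed, titled, linked topics make simple checks possible, such as finding links that point nowhere or titles that are not unique.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class TopicType(Enum):
    CONCEPT = "concept"
    TASK = "task"
    REFERENCE = "reference"


@dataclass
class Topic:
    title: str            # the label
    topic_type: TopicType  # the chunk's content type
    body: str
    related: list = field(default_factory=list)  # titles of linked topics


def broken_links(topics):
    """Return (source title, target title) pairs that point at no known topic."""
    titles = {t.title for t in topics}
    return [
        (t.title, target)
        for t in topics
        for target in t.related
        if target not in titles
    ]


def duplicate_titles(topics):
    """Return titles used by more than one topic."""
    counts = Counter(t.title for t in topics)
    return sorted(title for title, n in counts.items() if n > 1)


topics = [
    Topic("What is process monitoring?", TopicType.CONCEPT,
          "Background on the monitoring utility.",
          related=["Resetting a process status"]),
    Topic("Resetting a process status", TopicType.TASK,
          "Step-by-step procedure.",
          related=["What is process monitoring?", "Status codes"]),
]
missing = broken_links(topics)   # "Status codes" has not been written yet
repeats = duplicate_titles(topics)
```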

Best Practices for Editing Modular Documentation

Based on these three main concepts of modularization, as well as on our combined editorial experiences, we have identified the following guidelines that we consider to be best practices for editing modular documentation:

  • Topic types must not be mixed
  • Topics must be standalone
  • Introductory information must be clear and to-the-point
  • Topics cannot be too long
  • Paragraphs must be short
  • Titles must be unique and descriptive
  • Related topic links must be meaningful
  • Topic collections must be useful and reader-focused

Topic Types Must Not Be Mixed

As stated previously, modular documentation must be chunked so that descriptive information (concept and reference topics) is clearly separated from task-oriented information (task topics). Readers who are looking for information about how to do something do not want to wade through too much descriptive information to get to what is relevant to them. Similarly, readers who want to know detailed background information about what a product or service does do not necessarily want to see step-by-step procedures about how to use it.

Editors must be aware of the difference between descriptive and task-oriented information when editing modular documentation. For example, when reviewing a task topic about how to reset the status of a monitored system process, we need to make sure that it contains only the step-by-step procedure readers need to carry out the task successfully. There must be no information about how the utility that does the system monitoring works, no information about other things the utility does, and no information about how to configure the utility.

An important point must be made here, though. All task topics need some sort of brief (one- to three-sentence) introduction about the purpose of the task and its context. This descriptive information is an integral part of the task topic and is not a separate, standalone concept topic.

Topics Must Be Standalone

Because modular documentation is made up of chunked topics that are not read in any particular sequence, each one must be standalone. Readers must be able to understand the topic they are reading without having to read something else, that is, all the information they need is located in this topic.

However, topics should not repeat the same background information over and over again. To reduce this risk of repetition, only the most relevant information is written in the topic, and then cross-references to where readers can get more details are provided (links between topics are discussed in a subsequent guideline).

Editors must determine how self-contained a standalone topic must be and how much repetition of information is needed. When we edit these topics, we must keep in mind the question, “If readers started reading at this topic and had not read anything else prior to this, would this make sense?” Continuing with our example of the task topic about resetting a monitored process’s status, we need to make sure that the answer to this question is yes. Readers must have enough contextual information about this task to understand and carry it out (provided by the brief purpose information before the procedure starts), and if they want more information about the utility, there must be a cross-reference to the relevant concept topic.

Introductory Information Must Be Clear and To-The-Point

Chunking documentation into meaningful standalone topics requires the information in the topics to be organized in a logical and usable order. This chunking of information means that the first paragraph of a topic is the most important paragraph because it states the purpose of and summarizes the information presented in that topic. Procedures especially require standard introductory wording. This introductory information helps readers know if they are in the right place for the right information they need. The first sentence of the topic must also be clearly and directly related to the title, so readers can immediately see the connection between it and the topic content.

This first paragraph can also be used in building search terms in searching systems, and in some display implementations (such as the IBM® Eclipse Help System), the first paragraph can be used as hover help for links and as descriptive text for embedded child links. The introductory information serves the topic itself, and it is reused for many other retrieval techniques.

Topics Can