Documentum Performance – Search, Retrieval and Inbox

I was talking last week to a client who was struggling with Documentum system performance.  This client had recently brought in Documentum Consulting and, having been unsuccessful in improving performance, was concerned that there might be nothing left to do.  This post focuses on innovative things TSG has done for clients with regard to Documentum performance that are outside the norm of typical database/server tuning, particularly for Search, Document Retrieval and Inbox viewing.

Problems with the “Simple” search

In discussions with the client, we learned that many Webtop users rely on the simple search on the front page of Webtop.  While users might ask for a “Google” free-form search, we have pointed out in a previous article that ECM searches are often not like a Google search, and that users are really just concerned that:

  • Webtop Front-Page “Google Like” search is slow and inaccurate.
  • Advanced Search requires users to select a lot of items, including document type (from a large and difficult-looking list), attribute (from another large and difficult-looking list) and operator (=, <, >), and then to type in a value (which is only sometimes accurate).

In working with clients, TSG routinely offers search alternatives or customizations to address the above issues.

Problems with Document Retrieval

Issues with document retrieval are similar to search issues but can’t always be fixed with standard database tuning, as they involve:

  • Document Format Type – Certain types (e.g., Word) require the native application to launch before the document can be viewed.
  • Document Size – The size of the document affects both storage retrieval and network transmission.
  • Document Transformation – Some documents are converted on the fly (thumbnails, PDF) and require back-end processing which can delay viewing.

Problems with Inbox Performance

The client was also seeing significant performance problems with inbox viewing due to an increased volume of inbox entries.  While not as common as Search complaints, we have seen a number of different clients express concern with inbox viewing.  Some common issues:

  • Workflow searches are often not very efficient.  As with search, the more workflows in the system, the longer the query will take.  Unlike search, clients are unlikely to put a database index on an obscure workflow field.  Also unlike search, xPlore does not index workflow objects.
  • Inboxes are often customized.  Logic for security and for adding fields is typically customized in the inbox.  This can include looking at documents for a due date (and displaying it in the inbox).  These queries add to the complexity of the inbox processing and, while not noticed with two or three entries, can significantly degrade performance as the inbox fills.

Documentum Search Performance Tips – Start with Documentum Search from a User Perspective

Our most common user complaint regarding performance is that “the search is too difficult and the results take too long”.  Whether using D2, Webtop, xCP or our HPI Interface, users need:

  • A simplified interface (User Performance)
  • Quick Performance (System Performance)

For a simplified interface, we see clients looking to configure or customize the typical user choices in the search.  For example:

  • Document Types – if a user typically only works with three document types, the interface should default to allow them to only pick one of those three types.
  • Attributes – if a user typically only works with a couple of attributes, the interface should default to showing only those attributes.
  • Values – for attributes that only accept a fixed set of valid values, the interface should present those values as choices that the user can quickly select.
  • Results – too often the focus is only on how fast results are returned rather than on how many are returned.

While users might say they want a Google Search, they rarely want Google Results with tons of pages to review.  For quick search performance:

  • Leverage xPlore – xPlore indexes both the attributes and the content.  Clients should consider using xPlore even if they are not looking for full-text search.  If you see xPlore’s Content Processing Service (CPS) bogged down, consider adding a secondary CPS dedicated to Search.  If xPlore indexing is bogged down as well, the CPS instances can be targeted to handle indexing, search or both.
  • “Equals” is better than “Starts With” is better than “Contains” – by default with xPlore, the “simple” search is doing a contains search against EVERY attribute in the system at the dm_document level.  Small changes, like switching to a “Starts With” search and only searching on common attributes like object name and title, will dramatically improve performance.
  • Leverage document objects properly – Simple search is performing a search on all dm_documents.  If a client has built the right object model, why not limit the search to objects that are relevant?  For example, in a 200,000 document system, train users to never search dm_document but rather a sub-object type (for example, sop_documents) that might only have 2,000 documents.
  • Simplify Security – Without going into too much detail, complicated security results in complicated database queries.  Certain security configurations with many ACLs can dramatically affect performance.
  • Cache those consumers – As Documentum use and the number of documents grow in an organization, performance can often degrade.  With different documents and groups, security can get complex and, regardless of any performance tuning, Documentum will check security on every document on every search.  Many clients have had significant success in caching public or lightly secured documents outside of Documentum for read-only access.
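
As a rough illustration of the “Equals / Starts With / Contains” and sub-type points above, the sketch below builds a DQL string for a narrow, single-attribute search.  It is only a sketch under stated assumptions: the function name and its parameters are hypothetical, sop_documents is the example type from this post, and it shows a plain DQL attribute query rather than how Webtop or xPlore actually construct their searches.

```python
def build_search_dql(object_type, attribute, value, operator="starts_with", limit=50):
    """Build a DQL string for a narrow, single-attribute search.

    Hypothetical helper for illustration only; it is not part of any
    Documentum API.
    """
    # Only allow simple identifiers so type and attribute names cannot inject DQL.
    for name in (object_type, attribute):
        if not name.replace("_", "").isalnum():
            raise ValueError(f"invalid identifier: {name!r}")

    # Double any single quotes so the value stays inside the string literal.
    escaped = value.replace("'", "''")

    if operator == "equals":
        predicate = f"{attribute} = '{escaped}'"
    elif operator == "starts_with":
        predicate = f"{attribute} LIKE '{escaped}%'"
    elif operator == "contains":
        # Leading wildcard: typically the slowest of the three options.
        predicate = f"{attribute} LIKE '%{escaped}%'"
    else:
        raise ValueError(f"unknown operator: {operator!r}")

    # Query the specific sub-type instead of dm_document, and cap the result size.
    return (
        f"SELECT r_object_id, object_name, title FROM {object_type} "
        f"WHERE {predicate} ENABLE (RETURN_TOP {limit})"
    )


if __name__ == "__main__":
    print(build_search_dql("sop_documents", "object_name", "SOP-1"))
```

The default here is “starts with” against a single attribute on a sub-type, which mirrors the recommendation above; a “contains” search remains available but has to be requested explicitly.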

Among the solutions above, caching is the most common “outside the box” tuning we see.  In addition to improved search results viewing performance, other benefits include:

  • Reduced Documentum system load – With consumers searching the cache, authors and approvers don’t get bogged down by complicated searches directly against Documentum.
  • Business Continuity – Documentum can be down (for an upgrade or other reasons) and search can continue.  See related post on Documentum Business Continuity.
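
To make the caching idea concrete, here is a minimal in-memory sketch of a read-only consumer cache.  Everything in it is an assumption for illustration: the class and field names are hypothetical, the “public” flag stands in for whatever security rule a client uses to decide what may be cached, and a real cache would live in a separate store outside of Documentum rather than in process memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedDocument:
    """Read-only copy of the metadata consumers search on (hypothetical shape)."""
    object_id: str
    object_name: str
    title: str


class ConsumerCache:
    """Holds only public documents so searches need no per-document security check."""

    def __init__(self):
        self._docs = {}

    def publish(self, doc, is_public):
        """Called when a document is approved or its security changes."""
        if is_public:
            self._docs[doc.object_id] = doc
        else:
            # A document that is no longer public must leave the cache.
            self._docs.pop(doc.object_id, None)

    def search(self, prefix):
        """Starts-with search on object name, answered entirely from the cache."""
        needle = prefix.lower()
        hits = [d for d in self._docs.values() if d.object_name.lower().startswith(needle)]
        return sorted(hits, key=lambda d: d.object_name)


if __name__ == "__main__":
    cache = ConsumerCache()
    cache.publish(CachedDocument("0900000180000001", "SOP-100 Cleaning", "Cleaning"), is_public=True)
    cache.publish(CachedDocument("0900000180000002", "SOP-200 Draft", "Draft"), is_public=False)
    print([d.object_name for d in cache.search("sop")])
```

Because searches are answered from the cache alone, they keep working while the repository is unavailable, which is the business continuity benefit described above.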

For some of our thoughts on best practices, see our Building a Consumer Interface for Documentum whitepaper, which compares Webtop to our best practice search interface, HPI.

Documentum Document Retrieval Tips

For document retrieval, we typically see the issues addressed in the following ways:

  • Document Type – We are big fans of converting everything to PDF for quick viewing.  If viewed in the browser, the first view is typically slow, but subsequent views are fast.
  • Document Size – Working to reduce document size, particularly with image resolution, is fairly common.  For viewing – typically 200 dpi is best initially with the ability to access higher resolutions for the same document if needed.
  • Document Transformation –  We would recommend some of the transformations (image, PDF headers/footers) be added before view-time if possible.  For example, if a document is in a workflow, when the user acquires the workflow, begin the transformation process before the user selects to view the image.
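
The “transform before view-time” idea can be sketched as a small background pre-renderer.  This is only an illustration: the class, the method names and the transform callable are hypothetical placeholders for whatever rendition or transformation service a client actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


class PreRenderer:
    """Starts a transformation when a task is acquired, not when the user clicks View."""

    def __init__(self, transform, max_workers=2):
        # transform: callable taking a document id and returning the rendition
        # (hypothetical stand-in for a real transformation service).
        self._transform = transform
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = {}

    def on_task_acquired(self, document_id):
        """Kick off the transformation in the background, once per document."""
        if document_id not in self._pending:
            self._pending[document_id] = self._pool.submit(self._transform, document_id)

    def view(self, document_id):
        """Return the rendition, waiting only for whatever work is still left."""
        future = self._pending.get(document_id)
        if future is None:
            # Nothing was started early, so transform on demand.
            return self._transform(document_id)
        return future.result()


if __name__ == "__main__":
    renderer = PreRenderer(lambda doc_id: f"{doc_id}.pdf")
    renderer.on_task_acquired("doc1")
    print(renderer.view("doc1"))
```

The sketch assumes a single caller and does not evict finished renditions; a real implementation would need both thread-safe bookkeeping and a cleanup policy.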

Documentum Inbox Performance Tips

There is not an easy answer for inbox performance given that typical issues are the result of customizations and unique to the security and business process.  Some quick ideas:

  • Document information – We have seen examples where inbox items have a due date or other attributes that have to be extracted from the attached documents.  If possible, putting these values on the workflow object (perhaps in the message field) will improve performance.
  • Quick Display – One approach has been to quickly display just the basic inbox item information and then use Ajax to gradually fill in the other values (like due date).  In this manner, the screen displays quickly without waiting for the other data.
  • Get Next – Rather than having users complete an item and automatically return to the inbox (resulting in it being regenerated), some clients have chosen a simple “Get Next” button to just move on to the next item in the queue and skip the inbox itself.  We might do a blog post at a later time about the ups and downs of always letting users pick what they are working on (and skipping stuff they don’t).
  • Indexes – Just like other database tuning, look at the query that generates your inbox, specifically the WHERE clause.  Talk to your DBA team to determine if additional database indexes would speed up the query.
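
A “Get Next” flow can be sketched in a few lines.  The version below is a simplified, in-memory illustration: the class name and the (due date, task id) tuple shape are assumptions, and the due date is taken to be stored with the queue entry itself, as suggested in the first tip, so no document lookup is needed to order the work.

```python
from collections import deque


class WorkQueue:
    """Hands out the next task directly instead of re-rendering the whole inbox."""

    def __init__(self, items):
        # items: iterable of (due_date, task_id) tuples with ISO-format date
        # strings, so sorting the tuples orders the tasks by due date.
        self._items = deque(sorted(items))

    def get_next(self):
        """Return the task id with the earliest due date, or None when empty."""
        if not self._items:
            return None
        _due_date, task_id = self._items.popleft()
        return task_id


if __name__ == "__main__":
    queue = WorkQueue([("2024-03-01", "task-b"), ("2024-02-15", "task-a")])
    print(queue.get_next())
    print(queue.get_next())
    print(queue.get_next())
```

The queue is built once, and each completed item leads straight to the next one, so the expensive inbox query is not rerun between tasks.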

Summary

Issues about performance can be creatively solved using a number of different processes, tools and approaches.  As presented above, some of the more innovative tips aren’t just about technical issues but require a deeper understanding of the business process along with technical alternatives.  If you have other thoughts or tips – please add below.

6 thoughts on “Documentum Performance – Search, Retrieval and Inbox”

  1. Great blog! There are a lot of solid recommendations here, and I really like the search section. Thanks for sharing!

  2. Hi,
    You are absolutely right that search and inbox are among the most sensitive parts of the Documentum user experience.

    I want to comment that using dm_document on search is very uncommon, probably every installation requires custom types and Webtop customization/configuration to hide generic types and expose custom types instead.

    Regarding Inbox performance, I would add that maintenance and cleanup of dmi_queue_item objects is a must. “Dequeued” items are not deleted, only marked as dequeued or completed by the system; this includes the index server queue and all of the system’s user inboxes. They are all in the same database table. This is by far the biggest performance issue with the Inbox. Old or unneeded dmi_queue_items can be safely deleted to improve performance. You should also take a look at uncompleted/halted workflows and abort them if they are not needed, to clean up the system’s dmi_workitem objects (another huge performance bottleneck). You can also work with table partitioning on Enterprise Ed. databases to improve performance. And, as you correctly said, database index creation is a must!

    Regards,
    Jorge S.
