Documentum Cross-Repository Searching – an integrated open source approach

We just completed a very interesting integration for one of our clients to enable searching across multiple Documentum repositories.  The client was struggling with the Federated Search from Documentum and was looking for a better solution to access two separate Documentum instances with a cleaner search.  This post will detail our approach and some of the benefits of leveraging an integrated open source solution over a packaged solution from Documentum.

World Browse but limited Read Access

One interesting requirement was that, for most content, every user should be able to find a document in search, whether or not they are allowed to view its content.  This was difficult to accomplish with standard Documentum: we didn’t want to change an already complex ACL structure (the client has pretty involved ACLs) to grant world Browse, and we didn’t want to add every user to each docbase.

The TSG solution lets all users see search results for all documents in the repositories.  When a user clicks on a document, either the PDF rendition of the document is launched or, if the user doesn’t have read access, the user is advised how to request it.

Single Sign-on

Another interesting element was how the company’s current single sign-on could be leveraged to access the read-only listing and then leveraged again to access the document itself.  The process went as follows:

  • User would log on to the company environment as they normally would.
  • The user would select the new cross repository search that would not require a Documentum login.
  • The user would search across multiple repositories and receive a listing of documents.
  • If the user selected a document, the cross repository search would attempt to log the user into Documentum via the single sign-on and, if the user had access, retrieve the PDF document in a browser.  If the user wanted to edit the native content or use other Documentum functions, the browser view contained a deep link to the Webtop location of the document, and Webtop could be launched in another browser window.
  • If the user did not have access to the Documentum instance they would be advised how to obtain access.
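The click-through step above can be sketched roughly as follows. This is only an illustration of the decision, not the actual HPI code: the `SearchHit` fields, the `handle_click` function, the `has_read_access` callback (standing in for the single sign-on login and permission check against Documentum) and the Webtop link templates are all hypothetical names and values.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SearchHit:
    repository: str  # which Documentum repository the hit came from
    object_id: str   # Documentum object id stored in the search index


# Hypothetical deep-link templates, one per Webtop instance. The real
# link format depends on how each Webtop is deployed.
WEBTOP_LINKS: Dict[str, str] = {
    "repo_a": "https://webtop-a.example.com/webtop/drl/objectId/{object_id}",
    "repo_b": "https://webtop-b.example.com/webtop/drl/objectId/{object_id}",
}


def handle_click(
    user: str,
    hit: SearchHit,
    has_read_access: Callable[[str, SearchHit], bool],
) -> dict:
    """Decide what to show when a user selects a search result.

    has_read_access is an assumed callback that would perform the
    single sign-on login and the read-permission check in Documentum.
    """
    if has_read_access(user, hit):
        link = WEBTOP_LINKS[hit.repository].format(object_id=hit.object_id)
        # Open the PDF rendition and offer the Webtop deep link for editing.
        return {"action": "open_pdf", "webtop_link": link}
    # No read access: tell the user how to request it.
    return {"action": "show_access_instructions", "repository": hit.repository}
```

The search listing itself never calls the access check, which is what lets every user browse results without a Documentum login.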

Cross Repository Infrastructure and Approach

The components for the cross repository search included the following items:

  • Lucene/Solr – we installed a separate instance of Lucene/Solr to hold both the metadata and full-text search indexes.
  • OpenMigrate – was responsible for indexing all the content from the repositories for the initial Lucene index as well as ongoing indexing of new or revised content.
  • HPI – was leveraged for the configurable search interface for retrieval as well as passing the link to Webtop instances.
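To make the division of labor concrete, here is a minimal sketch of what an index document and a search request might look like when several repositories share one Solr index. The field names (`repository`, `content`), the composite `id` and both helper functions are assumptions for illustration; only `r_object_id` and `object_name` are standard Documentum attributes, and `q`, `fq` and `rows` are standard Solr query parameters.

```python
from typing import Dict, List, Optional


def to_index_doc(repository: str, obj: Dict[str, str], full_text: str) -> Dict[str, str]:
    """Map one Documentum object to a flat document for the shared index.

    Prefixing the id with the repository name keeps ids unique even if
    two repositories ever produced the same object id.
    """
    return {
        "id": f"{repository}:{obj['r_object_id']}",
        "repository": repository,
        "object_name": obj.get("object_name", ""),
        "content": full_text,  # extracted text used for full-text search
    }


def search_params(
    terms: str,
    repositories: Optional[List[str]] = None,
    rows: int = 25,
) -> Dict[str, object]:
    """Build Solr query parameters for a cross-repository search.

    With no repository filter, the query spans everything in the index.
    """
    params: Dict[str, object] = {"q": terms, "rows": rows}
    if repositories:
        # Filter query restricting hits to the selected repositories.
        params["fq"] = "repository:(" + " OR ".join(repositories) + ")"
    return params
```

Because both repositories land in the same index, a single query covers them, and the `repository` field is what lets the interface route a selected hit back to the right Webtop instance.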

Cross Repository Search

Advantages to the approach

The approach has several key advantages over the traditional Federated Search components from Documentum, including:

  • User Friendly Search Interface – We have discussed this on this forum many times but the typical “build a search” Documentum search from Webtop or other Documentum tools is not very user friendly.  See related post comparing Documentum Webtop, D2, XCP and HPI.
  • Webtop Isolation – With this approach, nothing in the multiple Webtop instances needs to change in any way.
  • Retrieval Only/Deep Dive into Webtop only when needed – The Federated Search approach from Documentum assumes an author role, where someone wants to both search and edit documents, so users get Webtop or other interfaces in their full-function mode.  Most users are more interested in quick retrieval and can be hampered by the full-featured interfaces.
  • Performance – We have seen this many times with our complete Caching model, but typically performance of Webtop and Documentum in general can be tied back to expensive queries.  By leveraging a separate Lucene instance, cross-repository searches run against the index rather than against Documentum, so searching adds no query load to the Documentum repositories.
  • Future Expansion – Other Repositories – While the focus of the first implementation was on multiple Documentum instances, the environment can be configured to capture and index documents from non-Documentum repositories as well.
  • Full Text Indexing – The ability to conduct full text searches is not lost in the cross-repository search.  OpenMigrate triggers an indexing event in Solr every time a document is created, modified, or deleted in Documentum so that the index captures both the full text and all document metadata.
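As a rough sketch of the indexing events mentioned in the last bullet, the function below turns a create, modify or delete event into the JSON body that Solr's update handler accepts. How OpenMigrate actually builds and sends these requests is not described here, so the function name, the event names and the document fields are assumptions; only the `add` and `delete` command shapes come from Solr's JSON update format.

```python
import json
from typing import Dict, Optional


def solr_update_command(event: str, doc_id: str, doc: Optional[Dict[str, str]] = None) -> str:
    """Build the JSON body for a Solr update request from a repository event.

    Create and modify both map to "add": assuming "id" is the index's
    unique key, re-adding a document with the same id replaces the old one.
    """
    if event in ("create", "modify"):
        if doc is None:
            raise ValueError("create/modify events need the document fields")
        return json.dumps({"add": {"doc": {**doc, "id": doc_id}}})
    if event == "delete":
        return json.dumps({"delete": {"id": doc_id}})
    raise ValueError(f"unknown event type: {event}")
```

The resulting string would be posted to the index's update endpoint and followed by a commit, so the search listing stays in step with each repository.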

Summary

Leveraging an integrated open source approach for cross-repository searching has several interesting advantages over the full-function tools from Documentum in terms of user interface, performance and expansion capabilities.

Let us know your thoughts below: