As we approach the two-year mark of the economic downturn, we are proud to say that TSG has survived (and even profited) throughout the downturn. This post will share some of our thoughts on how we survived and how we positioned ourselves for growth in 2010.
This is the third post in our series on migrating from Documentum to Alfresco. As we pointed out in our disclaimer in our initial post on this subject, we are not saying Alfresco is better than Documentum, just different.
Which Interface to compare?
The challenge in comparing Documentum and Alfresco interfaces is that both solutions offer a variety of interfaces, including:
- Traditional “Library” Interface – For this interface, Documentum has Webtop and Alfresco has Explorer. This interface can be seen as a “do all” interface with cabinets/folder navigation as well as search, check-in/check-out and all other ECM functionality.
- Collaborative Interfaces – For this interface, Documentum has CenterStage and Alfresco has Share. Both interfaces are somewhat “SharePoint” like with the idea that, while still exposing document management functionality, the users can set up sites and collaborate on content.
- Client-Side Access – Documentum has recently shifted users to the MyDocumentum suite of products to handle interfacing with Microsoft Office products, including Outlook. Alfresco offers similar integrations with Office products, as well as the ability to interact with your repository as a mapped drive (CIFS). Both of these interfaces expose the repository to users by enabling them to perform common functions such as check-in/check-out, search, convert to PDF, etc.
- Case Management – Documentum is very focused on xCP being the interface for Case Management. Alfresco relies on third party Open Source vendors to supply this type of solution.
- Third Party Integrations – Both vendors have partners that have developed additional interfaces.
As we mentioned earlier in the year, we have many clients looking to push content out of Documentum for consumers. We are currently deploying this type of application for a client with specific requirements in how to display the date in the consumer interface. For many of our web applications, we leverage client utilities (YUI, GWT, etc.) to expedite the user interface development. During testing of the application, we uncovered a bug that was easily fixed, but the root cause of the issue is something to keep in mind as you are building web applications and relying on client-side processing to generate information.
In addition to the enhancements to HPI discussed last week, we are also enhancing the search capability in HPI for a large Documentum pharmaceutical client. Features will include:
- Lucene Integration: enhanced compatibility of HPI and Lucene through TSG’s OpenContent web services layer
- Save Search: save, recall, run, delete and edit a saved search
- OpenSearch Integration: execute a Lucene search (full text and metadata) and return it in an OpenSearch compliant RSS feed that includes a custom namespace to allow for exporting custom metadata in XML format
- Enhanced Search Controls: new multi-select control that includes a type-ahead feature
- Direct Linking: enable direct linking to content via a URL
- Doc Management: one-click links to view versions from the search results
- Single Sign On: leverage Kerberos to perform automatic application authentication
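As a rough illustration of the OpenSearch integration item above, the Python sketch below builds an RSS 2.0 feed that carries OpenSearch 1.1 response elements plus a custom namespace for document metadata. This is not HPI or OpenContent code: the `build_feed` function, the `meta` prefix, the custom namespace URI and the example URLs are all assumptions made for illustration. Only the OpenSearch namespace and its `totalResults`, `startIndex` and `itemsPerPage` elements come from the OpenSearch 1.1 specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OpenSearch 1.1 specification.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"
# Hypothetical namespace for custom metadata; the real one is not given in the post.
CUSTOM_NS = "http://example.com/ns/docmeta"

ET.register_namespace("opensearch", OPENSEARCH_NS)
ET.register_namespace("meta", CUSTOM_NS)


def build_feed(query, results):
    """Return an RSS 2.0 string describing search results with custom metadata."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Search results for {query}"
    ET.SubElement(channel, "link").text = "http://example.com/search"
    ET.SubElement(channel, "description").text = "Full text and metadata search"
    # OpenSearch response elements describing the result page.
    ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}totalResults").text = str(len(results))
    ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}startIndex").text = "1"
    ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}itemsPerPage").text = str(len(results))
    for result in results:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = result["title"]
        ET.SubElement(item, "link").text = result["link"]
        # Custom metadata is exported as elements in the custom namespace.
        for name, value in result.get("metadata", {}).items():
            ET.SubElement(item, f"{{{CUSTOM_NS}}}{name}").text = value
    return ET.tostring(rss, encoding="unicode")


sample = [{
    "title": "Exception Report",
    "link": "http://example.com/doc/1",
    "metadata": {"author": "jdoe"},
}]
print(build_feed("exception report", sample))
```

Because the custom metadata lives in its own namespace, a standard feed reader can ignore it while a consuming application can still extract it.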
Core components of these enhancements will be rolled into our HPI 1.4 feature set and will be available later in 2010. For more information, or to download HPI or view recorded demos, please visit http://www.tsgrp.com/ and our Learning Zone.
As mentioned in a previous article, many clients are moving away from FAST in preparation for the eventual release of Documentum Search Services (DSS), which is slated for release in June and leverages the open source product Apache Lucene. This post will share the results from one client that executed a proof of concept test to compare the two search engines.
Proof of Concept Approach – As we have mentioned before, many clients have decided to implement an external cache outside of Documentum to address business continuity, performance and licensing issues. For a large pharmaceutical client, TSG was tasked with performing a proof of concept on 156,000 documents in an external data source indexed by Lucene. The proof of concept would compare the search results of FAST within Documentum (Webtop) with those of Lucene (HPI) outside of Documentum. The proof of concept additionally evaluated leveraging Lucene for metadata storage rather than storing metadata in another database such as Oracle.
POC Findings – Lucene/HPI with the external repository was found to be considerably quicker than the existing FAST/Webtop implementation on most queries.
|Results Returned|FAST/Webtop|Lucene/HPI|
|---|---|---|
|1200 Results|90 seconds|3 seconds|
|8 Results|5 seconds|3 seconds|
|10 Results|8 seconds|4 seconds|
|76 Results|10 seconds|5 seconds|
|5100 Results|72 seconds|5 seconds|
|65 Results|6 seconds|3 seconds|
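To put these timings in relative terms, the short Python sketch below computes the speedup for each row. It assumes that the first time in each row is FAST/Webtop and the second is Lucene/HPI, which is consistent with the finding that Lucene/HPI was quicker. The `speedups` function and variable names are illustrative only.

```python
# Timings from the proof of concept table, in seconds.
# Assumption: the first time is FAST/Webtop and the second is Lucene/HPI.
timings = [
    (1200, 90, 3),
    (8, 5, 3),
    (10, 8, 4),
    (76, 10, 5),
    (5100, 72, 5),
    (65, 6, 3),
]


def speedups(rows):
    """Return (result_count, speedup) pairs, where speedup = FAST time / Lucene time."""
    return [(count, fast / lucene) for count, fast, lucene in rows]


for count, ratio in speedups(timings):
    print(f"{count:>5} results: {ratio:.1f}x faster")
```

Under that assumption, the largest gains show up on the two largest result sets (1200 and 5100 results), while the smaller result sets are roughly twice as fast.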
A simple configuration of the Lucene index did a better job of returning a more complete search result set than the standard FAST/Webtop configuration. Examples included additional documents that were logical derivatives of the initial search word. For example, a search for “exception report” could return “exceptions report” or “exception reports”. The proof of concept data set also included German documents, and Lucene demonstrated multilingual stemming capability.
Key Stats – Lucene
- 156,000 Documents – 31.6 Gigabytes
- Total Index Space – 521 MB
- Total Index Build Time – 10 hours – The client was very interested in the time it took to index the content and metadata in Lucene because they had experienced lengthy indexing times with FAST in their 5.3 upgrade. This was tracked as part of the proof of concept; however, the corresponding FAST data from the 5.3 upgrade is no longer available.
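The key stats above imply an index-to-content ratio and an indexing throughput, which the small Python sketch below works out. It assumes 1 GB = 1024 MB, since the post does not say which convention was used; the variable names are illustrative.

```python
# Figures reported in the proof of concept.
documents = 156_000
content_gb = 31.6
index_mb = 521
build_hours = 10

# Assumption: 1 GB = 1024 MB; the post does not say which convention was used.
index_ratio = index_mb / (content_gb * 1024)
docs_per_hour = documents / build_hours
docs_per_second = docs_per_hour / 3600

print(f"Index size is {index_ratio:.1%} of the content size")
print(f"Throughput: {docs_per_hour:,.0f} documents/hour ({docs_per_second:.1f} per second)")
```

With either GB convention the index comes out at under 2% of the content size, and the build rate is 15,600 documents per hour.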
FAST and Lucene – Full Text Syntax Differences
- FAST
- “One Two” – will return documents with the exact phrase “One Two” in the document
- One Two – will return documents with the words One OR Two in the document
- One+Two – will return documents with the words One OR Two in the document
- One and Two – will return documents with the words One AND Two in the document
- Lucene – Based on the Proof of Concept’s configuration
- “One Two” – will return documents with the exact phrase “One Two” in the document
- One Two – will return documents with the words One AND Two in the document
- One OR Two – will return documents with the words One OR Two in the document
- One and Two – will return documents with the words One AND Two in the document
- One+Two – will return documents with the exact phrase “One Two” in the document
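The differences listed above can be summarized in a small Python sketch that maps a query string to the way each engine would interpret it. This is not the FAST or Lucene parser; the `interpret` function is hypothetical and covers only the query forms listed in this post, with the Lucene behavior reflecting the proof of concept's configuration. Operator matching is case-sensitive here (`and`, `OR`) purely as a simplification.

```python
def interpret(query: str, engine: str):
    """Return (mode, terms) for a query, where mode is PHRASE, AND or OR.

    Only the query forms described in the post are handled.
    """
    if engine not in ("fast", "lucene"):
        raise ValueError(f"unknown engine: {engine}")
    q = query.strip()
    # Quoted text is an exact phrase in both engines.
    if len(q) >= 2 and q[0] == '"' and q[-1] == '"':
        return "PHRASE", q[1:-1].split()
    # A plus sign means OR in FAST but an exact phrase in the Lucene configuration.
    if "+" in q:
        terms = [t for t in q.split("+") if t]
        return ("OR" if engine == "fast" else "PHRASE"), terms
    words = q.split()
    # The word "and" means AND in both engines.
    if "and" in words:
        return "AND", [w for w in words if w != "and"]
    if "OR" in words:
        if engine == "fast":
            # The post does not describe an explicit OR for FAST, so do not guess.
            raise ValueError("explicit OR under FAST is not covered by the post")
        return "OR", [w for w in words if w != "OR"]
    # Bare terms default to OR in FAST and AND in the Lucene configuration.
    return ("OR" if engine == "fast" else "AND"), words


for q in ['"One Two"', "One Two", "One+Two", "One and Two"]:
    print(f"{q!r}: FAST={interpret(q, 'fast')[0]}, Lucene={interpret(q, 'lucene')[0]}")
```

The two forms that behave differently are the bare terms and the plus sign, which are the ones users are most likely to trip over when moving between interfaces.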
Overall, the client was very satisfied with the findings and is moving forward with the solution. The flexibility of Lucene to index both the metadata and full-text values allowed the client to avoid adding an additional Oracle database to their external cache for attribute storage. The client also liked the simpler, more intuitive search interface of HPI compared to the Webtop interface.
In addition to leveraging Lucene for searching an external cache, we are also working to leverage Lucene for internal Documentum/Webtop search.
If you have any questions or would like more detailed information, please contact us or comment below:
As mentioned in a previous blog post on email integration, we just finished a prototype/proof of concept initiative for a manufacturing client looking at content management, and specifically Alfresco, for the first time. This initiative reminded me of common mistakes I have seen made when planning any content management system (Alfresco or Documentum). In this post, I thought I would highlight how we avoid falling into these pitfalls.
“If you build it, they will come”, while catchy, isn’t always true
The key for any content management initiative is to involve the knowledge worker in the process. Users play a key role in determining the success or failure of any content management initiative. Generally speaking, knowledge workers do not have to use the system. They have access to email, LAN drives, SharePoint and many other ad hoc tools they will choose to use to work on content. Engaging them in the process will help mitigate the risk that they will not use the system you are building.
Think “Outside of the Box” but don’t get stuck in the “Out of the Box” rut
It is not uncommon for IT to take the reins of a content management initiative, assume they know the best solution and “consult” the business late in the process. They often describe their system with phrases like “Pure Out-of-the-Box”, “You must change your process to fit the tool” and “We are outsourcing development”. Rather than take this hard-nosed approach, the business and IT should collaboratively think outside of the box and determine how the tool can work for them, not the other way around. Most successful implementations are derived from early prototyping to define meaningful system configurations, add-ons or smart customizations. IT alone cannot drive this process.
Too many cooks in the kitchen can be a recipe for failure
We’ve stressed the importance of IT and business working together; however, it is also important that you select the correct representatives from IT and business to participate. A common mistake is to involve too many people early in the process and try to design a system with this large group. This approach often slows the process down for many reasons: an inability to reach consensus, groupthink, a focus on atypical situations and personal agendas, to name a few. A more successful team has a focused membership of experts (IT and business) who represent the organization from a broader perspective rather than just their own area. Once this group has determined a direction, the proposed solution can be taken to a larger body of people for review.
Lose the arrogance and gain humility
From our perspective, the business and IT have a love-hate relationship. Clearly, one can’t exist without the other; however, it is hard for either side to completely trust the other. Just like building trust in any relationship, arrogance will not get you very far. IT should avoid throwing out aggressive sayings to the user community when trying to build their trust. The quickest way to build trust with the user community is to demonstrate that you are listening to their issues and, for many clients, to provide justification for spending their money. Don’t assume you understand their process, but rather pay attention and adjust the approach based on what you hear. This can be done through iterative development, which builds trust not only in the system but also among those required to use it.
Users make the best salespeople
We’ve found that if you engage the user community correctly through the planning and design process, you have a greater chance of success when you roll out the system. Key users can act as cheerleaders of the system; however, this enthusiasm must be sincere. By establishing user ownership early in the process, users will feel they played an integral role in implementing the system and will advocate its use better than any training manual or memo.
We had an open-forum discussion on Document Control at the most recent Midwest Documentum User Group meeting. The discussion focused on feedback from various members of the group as well as our experience over the years. We received positive feedback that the open discussion format, with multiple people participating, was insightful, and we are planning on using a similar format to discuss another topic at our next meeting. Below is a list of topics covered with highlights from the discussion:
- Document Properties – What are typical configurations and when do people customize
- Document Versioning – Rules around major and minor versions, when to delete
- Document Lifecycle – Flexible lifecycle to handle published and draft versions
- Security – Typical roles and responsibilities in Document Control
- Search – Simplifying search – simple search interface and consumer only viewing
- Document Viewing – Various ways in which companies have applied “smart” document overlays
- Change Request Package – CR data capture and relationship to change documents
- Review – Assignment of reviewers and methods for capturing review information
- Approval – Discussion on how to simplify approver selection and other common approval practices
The entire presentation is also available from the MWDUG website, or feel free to contact me (email@example.com) if you want more information on responses and comments received from the MWDUG participants.
While the majority of the work we do is Documentum or Alfresco related, we recently had the opportunity to help a client restructure their SharePoint implementation. They are a small company that adopted SharePoint several years ago; however, they were unhappy with how the system was being used (or, in essence, not being used). Each user had individual control over what they stored in SharePoint and how they stored it, so the system was being treated as a glorified file share. It was unclear what content existed and how many versions (or copies) of it existed. In short, it was very challenging for anyone other than the owner of a document to find anything.
While SharePoint typically isn’t our specialty, the principles behind content management are. We used our experience in this area to implement basic SharePoint configurations in order to have better control over their content. First we worked with the client to define an enterprise taxonomy which included the following steps:
- Define the main business processes in the company.
- For each business process, define what types of documents they create.
- For each document type, define what key data points they want to capture.
- For each data point, define possible values.
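The four steps above produce a nested structure: process, then document type, then data point, then allowed values. The Python sketch below shows one way to represent that shape and check a document's tags against it. It is only an illustration and not SharePoint configuration; the process names, document types, data points and the `validate_tags` function are all hypothetical and do not come from the client project.

```python
# Hypothetical taxonomy: process -> document type -> data point -> allowed values.
# None of these names come from the client project; they only illustrate the shape.
TAXONOMY = {
    "Purchasing": {
        "Purchase Order": {
            "Status": ["Draft", "Approved", "Closed"],
            "Region": ["North America", "Europe"],
        },
    },
    "Quality": {
        "Inspection Report": {
            "Result": ["Pass", "Fail"],
        },
    },
}


def validate_tags(process, doc_type, tags):
    """Return a list of problems found when tagging a document against the taxonomy."""
    doc_types = TAXONOMY.get(process)
    if doc_types is None:
        return [f"unknown process: {process}"]
    data_points = doc_types.get(doc_type)
    if data_points is None:
        return [f"unknown document type for {process}: {doc_type}"]
    problems = []
    for name, allowed in data_points.items():
        if name not in tags:
            problems.append(f"missing data point: {name}")
        elif tags[name] not in allowed:
            problems.append(f"invalid value for {name}: {tags[name]}")
    return problems


print(validate_tags("Purchasing", "Purchase Order", {"Status": "Approved", "Region": "Europe"}))
print(validate_tags("Quality", "Inspection Report", {"Result": "Maybe"}))
```

In SharePoint terms, the data points would correspond to site columns and the allowed values to fixed choice lists.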
This structure ensures that “like” documents are stored together and share the same definition through site columns and fixed lists of values. The end result makes it easier for users to find documents through search or navigation and mitigates the risk of duplicate work. These changes give users an intuitive means of finding documents rather than the legacy method, which consisted of calling someone, recreating the document or hunting and pecking in the system.
This structure was then used to configure process driven document libraries and search screens. Within the libraries we were able to leverage the required site columns to define document views in order to provide optional means of looking for documents. Since we removed the ability to bury documents in a folder structure, this configuration provides users the ability to create “virtual” folders that are meaningful to them.
Restructuring documents by process rather than department also allows for change in the future. As new processes are added or existing ones are changed, they can easily be incorporated into the current configuration. Additionally, if departments are restructured, it has no impact on the way the documents are organized.
The biggest challenge for this project was taking the current list of documents (upwards of 50,000) and determining where they fit in the new taxonomy. Since they were coming from an unstructured system, there was no easy way to automate this process; users had to manually determine where a document belonged in the new structure. This tedious task could have easily been rushed or eliminated from the project; however, the client made it a priority for a few reasons:
- SharePoint Cleanup – Since the old system was unstructured, there were a lot of documents that could be archived or deleted. Identifying these documents before migration cut the total count in half.
- Prove out the New Taxonomy – While the project team spent a lot of time defining the new structure, it was hard to prove that all aspects were considered. By determining where existing content belonged in the new structure, holes in the taxonomy were identified and accounted for prior to the system going live. This is where the ease and flexibility of configuring SharePoint worked in our favor. We could easily respond to changes in the system which I think will be a plus as the new process grows.
- Train Users – One of the best ways to learn and retain something is through repetition. The users assigned to tag existing documents had plenty of practice and exposure to the new structure. The idea is they can now take what they learned and be process champions to those not as exposed to the project. The user buy-in is critical for change management acceptance and adoption.
The users did a great job tagging all documents which then allowed us to migrate about 25,000 documents into the new structure. The process driven SharePoint has now been in place for over a month and the feedback has been positive.
“How to Gain High Performance from a Simplified ECM Search Interface”
On September 1, 2009, we will be co-hosting a webinar with Alfresco. We will be discussing our approach to address common search interface issues and how our simplified search solution, HPI, will help better realize the value of your Alfresco system.
Sign up for the webinar today!
On Monday, we were proud to accept an award from the National Association for Business Resources (NABR) for being one of the 101 Best & Brightest Companies in Chicago to work for. While we all love working at TSG, it was nice to have validation that the work environment we foster is recognized as a leader in today’s workforce. Being a smaller firm, we were honored to be in the presence of other winners such as Verizon, Ernst & Young, Astellas, Morningstar and KPMG.
Despite the current state of the economy, the awards luncheon was energetic and it was refreshing to know that many companies are still committed to their employees and making their working experience more than just a job.