As we have discussed in our Hadoop Series, more and more companies are considering Hadoop for storage and management of documents and files. Just like our ECM clients, companies storing documents or scanned files in Hadoop want to provide PDF renditions of documents for easy viewing and other PDF capabilities. This post will discuss how Adlib can be leveraged with Solr/Lucene behind TSG’s OpenContent layer to provide robust ECM capabilities for your Hadoop repository.
Hadoop Document Transformations Using Adlib
In our series exploring the use of Hadoop for ECM, we have pointed to a best practice from our years of ECM experience: documents should be stored both in their native format and as a PDF rendition. A PDF rendition gives consumers quick access to view the content, and it allows the content to be watermarked and controlled so that consumers cannot alter the documents. This post will explore TSG’s partnership with Adlib and how we are using Adlib’s PDF conversion suite to transform documents stored in Hadoop.