The Hadoop Distributed File System (HDFS) provides the ability to store an enormous quantity of files with redundancy. In our first release of OpenContent for Hadoop, we have included the ability to annotate PDF documents with OpenAnnotate and store and retrieve the PDF layers in Hadoop. This post will describe the integration with Hadoop as the ECM repository, as well as highlight some benefits of using an annotation tool that uses open specifications.
Hadoop
Hadoop Web Service REST API for Enterprise Content Management using TSG’s OpenContent
Many of our ECM clients develop their own Web Services layer to isolate their applications from the back-end repository and to provide a vehicle for adding their own services that talk to other non-ECM systems. OpenContent was developed as part of our Documentum practice to give clients a standard web services architecture with an open source approach. OpenContent is now available for the Hadoop NoSQL database, HBase. This post will detail the web services available in our first release with examples and explanations.
Hadoop for Enterprise Content Management – Adding PDF Renditions with Adlib
As we have discussed in our Hadoop Series, more and more companies are considering Hadoop for storage and management of documents and files. Just like our ECM clients, companies storing documents or scanned files in Hadoop want to provide PDF renditions of documents for easy viewing and other PDF capabilities. This post will discuss how Adlib can be leveraged with Solr/Lucene behind TSG’s OpenContent layer to provide robust ECM capabilities for your Hadoop repository.
Hadoop Document Transformations Using Adlib
In our series exploring the use of Hadoop for ECM, our years of ECM experience tell us that the best practice is to store documents both in their native format and as a PDF rendition of the content. Storing a PDF rendition gives consumers quick access to view the content and makes it possible to watermark and control the content so that consumers cannot alter the documents. This post will explore TSG’s partnership with Adlib and how we are using Adlib’s PDF conversion suite to transform documents being stored in Hadoop.
Hadoop – Disrupting the Relational Database Component of ECM
We had a good conversation yesterday with a long-time and innovative TSG client. The client has a mix of technical and business skills that makes him a visionary in a highly regulated industry with regard to Enterprise Content Management. In addition to our normal catch-up discussions about plans for the year and what we are seeing other clients do, we also talked about Hadoop and how it could disrupt traditional Relational Databases (RDBMS). This post will present highlights of that discussion from a business perspective.
TSG Announces Ephesoft support for Documentum and Hadoop
Ephesoft Document Capture capabilities now available for Documentum and Hadoop
Chicago, IL. – February 12, 2015 – Technology Services Group, Inc. (TSG), an open-source enterprise content management (ECM) solution provider, and Ephesoft, Inc., the creator of Smart Capture®, today announced Ephesoft support for Documentum and Hadoop.
TSG Announces Creation of Hadoop Practice
Open Source Hadoop becoming increasingly popular for ECM customers
Chicago, IL. – February 4, 2015 – Technology Services Group, Inc. (TSG), an open-source enterprise content management (ECM) solution provider, today announced the creation of a new practice area specifically focused on Hadoop and related technologies.
Hadoop – OpenContent/HPI Product Plans
The first step in supporting all of the TSG products on Hadoop is building our OpenContent REST Web Services layer to access Hadoop in the same manner we access Documentum, Alfresco and other content management systems. This post will present our plans and timelines for OpenContent along with associated TSG solutions.
Hadoop – Data Model for ECM applications
As we have talked to clients about Hadoop with HBase for ECM, too often we hear “but isn’t that just for big data?” This post will try to explain the benefits of Hadoop’s big data capabilities and data model in an ECM context compared to traditional database systems.
Hadoop – Why Hadoop as a Content Store when Caching Content for ECM Consumers
Last week we posted on a publishing approach for enterprise search. Along with enterprise search, we have seen more and more ECM clients look to publish content out of the ECM repository for a variety of business reasons, including performance, business continuity and cost reduction. This post will highlight how Hadoop can be used within a publishing architecture and explain some of the benefits.