Document management solutions as we know them are doomed to disappear in the medium term, and the publishers of document management software will be forced to roll with the times, sell out or face certain extinction. The archivists and IT officers who work in this field will also see their roles changing or vanishing.
The original primary purpose of electronic document management (EDM) was to allow documents to be managed on the basis of a classification plan and retention schedule, and of course to make it easier to retrieve exactly what was needed from the vast stockpile of “archived” documents.
Example of an EDM application interface.
One of the most important landmarks in the history of EDM software was the emergence in the 1990s of major large-scale applications based on SQL databases. The market then gradually evolved through a series of ambitious customer-based projects, with major players including FileNet (acquired by IBM in 2006), OpenText, Stellent (acquired by Oracle in 2006), and Documentum (acquired by EMC in 2003; EMC was in turn bought by Dell in 2016, which agreed that same year to sell Documentum to OpenText).
Projects that frequently overrun both budgets and schedules
A further key development was the launch of Microsoft’s SharePoint platform, an original open-API solution designed to capture as much of the market as possible. It was followed by open-source solutions (i.e. open-API by definition) such as the market-leading Alfresco. Although these solutions were also marketed to SMEs, the results they delivered were ultimately similar to those achieved by the “big hitters”: projects that often overran and whose costs were out of all proportion to client expectations.
- Large-scale solutions
These EDM applications were primarily designed to handle what was at the time regarded as large volumes of data with the help of SQL databases. No real thought was given to ergonomics; what mattered was keeping documents safe and retrievable.
The solutions often fell short of expectations, however, and scope creep in relation to business rules and integration-related (and subsequently infrastructure-related) constraints, exacerbated by exploding data volumes, resulted in a number of projects running out of steam. The open solutions referred to above, such as SharePoint and Alfresco, were quite rightly found to be too open and ended up imposing just as many constraints.
- Small-scale solutions
It was around this period that more modest solutions saw the light of day, developed on the basis of dBase or Microsoft Access. Small-scale software publishers identified a need for ergonomically designed user interfaces and off-the-shelf solutions, and tapped into a healthy market in the digitization and document archiving sectors, as well as among medium-sized companies and individual departments within larger organizations.
Products such as Scanfile (Spielberg), Laserfiche (Laserfiche) and M-Files (M-Files) have long been referred to as “small-scale” solutions. It may well be true that market barriers prevented them from demonstrating their true capacity to handle large volumes of data, but the advantage of these ready-made solutions was that they could be easily configured to the specific needs of each organization without any input from software developers.
- Additional functionalities
The document as the center of a network of modular management solutions and functions.
A number of software publishers integrated (more or less successfully) additional modular functionalities such as digitization, email management or document processes into their applications.
This all-inclusive approach was often driven not so much by software development as by takeovers of companies active in fields such as process management, knowledge management or business intelligence (BI).
Drowning in documents
An approach of this kind is often primarily a marketing strategy aimed at winning more customers without improving the performance of the original solution. Indeed, the performance of EDM solutions is in an irretrievable slump, mainly as a result of skyrocketing volumes of documents and the need for wider and deeper integration with the other systems used by organizations. The final development worth noting is the first sign of concentration within the market: the acquisition of Documentum by OpenText (making the latter the largest global player). What is now becoming clear, however, is that EDM is unlikely to feature in any company’s future toolbox of software solutions.
Constraints at every turn: performance, new legal challenges, data
EDM is lagging behind its own agenda, and has proved essentially incapable of delivering what organizations need. It is faced with major constraints such as infrastructure-related performance, and in particular new legal challenges which mean that documents can no longer be managed on their own; it is also necessary to manage the data within these documents, and even data held separately in databases.
Legislators are taking great strides forward in terms of protecting personal information and ensuring access to information.
Companies will soon be forced to answer the following questions:
- Are you familiar with the content of each of the documents in your possession?
- Can a full text search be carried out on all of these documents?
- Do you use your EDM system and its rules to manage documents being worked on jointly within business processes?
- Do you know how to find duplicate documents and different versions of every document within your information system?
- Do you know how to find a final (signed and archived) version of a document?
- Do you know how data and databases should be managed in order to comply with the retention periods which apply to documents?
- Are you happy with your current solution?
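The question about duplicate documents in the list above is one of the easier ones to illustrate. The following is a minimal sketch, not a description of how any particular EDM product works: it assumes documents are available as raw bytes and groups them by a SHA-256 fingerprint. The `repository` dictionary and its file names are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact byte content."""
    return hashlib.sha256(content).hexdigest()


def find_duplicates(documents: dict[str, bytes]) -> list[list[str]]:
    """Group document names whose content is byte-for-byte identical."""
    groups = defaultdict(list)
    for name, content in documents.items():
        groups[fingerprint(content)].append(name)
    # Keep only fingerprints shared by more than one document.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical in-memory repository; a real system would read from storage.
repository = {
    "contracts/2016/acme.pdf": b"signed contract v2",
    "shared/tmp/acme-copy.pdf": b"signed contract v2",
    "contracts/2016/acme-draft.pdf": b"draft contract v1",
}
duplicates = find_duplicates(repository)
```

A fingerprint of this kind only catches exact copies. Finding different versions of the same document, as the question also asks, would need metadata or content similarity on top of it.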
- A change on the horizon
In order to find a solution which can handle both old challenges (made more difficult by the massive volumes of data now produced) and new challenges which cannot be met by previous solutions or well-timed roll-outs of new features, software publishers must contemplate large-scale shifts in the architecture of their systems. These will involve firstly separating out processing activities, and secondly managing all governance-related documents and data at company level.
As far as software publishers are concerned, we will soon see what the future holds for two platforms brought together under a single brand; after several acquisitions, OpenText is touting for new customers with an all-inclusive offer, but will Livelink and Documentum users believe in its promises of all-round functionality given that the customer will ultimately be responsible for integrating these functions? We have learned that it is often preferable to purchase the best tool for each job and to integrate these tools using external tools not linked to the same software publisher. In order to keep a handle on costs, every organization must buy the tools which best meet its needs and then tackle the task of integrating and maintaining these tools itself.
- The technology is already out there
There is still widespread ignorance in the market about these solutions and these innovative technologies, not to mention a lack of willingness to make this shift. We are nevertheless not far off the point where there will be no other choice if we wish to achieve universally uniform document handling procedures and guaranteed regulatory compliance. For example, the following new legislation will enter into force in the EU in 2018:
- MiFID II (the Markets in Financial Instruments Directive), a piece of legislation which will result in far-reaching changes as regards the transparency of financial markets and investor protection;
- the General Data Protection Regulation (GDPR).
- What can be done?
Any steps we take should be guided by the following principles:
- EDM as we currently know it is ultimately a classification plan and a retention schedule, and little use is made of metadata. This outdated approach ultimately delivers little in terms of added value, and we often ignore the treasure trove of useful information that can be mined from the contents of our documents and their metadata.
- Those who know how best to leverage the information stored in documents will gain an extra strategic and competitive advantage. Reference material stored in other company systems will play an enormously useful role in enriching information across the board.
- Current EDM solutions do not tackle these problems effectively; what is more, solutions based on relational databases are incapable of handling unstructured data efficiently.
- We must focus on innovative solutions which make it possible to gather, check and correlate vast volumes of heterogeneous data – whether structured or unstructured – and to use the relevant data for predictive and descriptive analysis.
Shared use of data and processing capacities
The new opportunities for shared use of data and processing capacities will more than justify the effort involved in this architectural revolution; for example, information which is used and updated by one business process will be available in updated form to all the other systems within the organization.
Information fusion has become possible with the emergence of BI and ERP (enterprise resource planning), thanks on the one hand to ETL (extract, transform, load) and on the other to EAI (enterprise application integration), both now referred to under the heading of ESB (enterprise service bus).
Big data and artificial intelligence, utilized for increasingly automated processes, are currently fueling the growth of supersized data handling solutions.
The main features of an ideal solution in this field would include the following:
- the ability to manage and store huge volumes of structured and unstructured data (documents, multimedia files, emails etc.);
- the traditional functionalities which customers expect from an EDM system;
- the facility to handle both data and metadata;
- easy integration of any future and existing sources of data (EDM, ERP, relational databases, resources from the Internet etc.);
- non-disruptive integration with existing IT systems;
- the assumption that several EDM solutions will co-exist or be migrated within the same organization;
- easy correlation of unstructured and structured data from different systems;
- a powerful indexing and search engine;
- bitemporal functionality, i.e. tracking both when a fact was valid and when it was recorded in the system;
- a high level of security in order to guarantee data protection and confidentiality.
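The bitemporal requirement in the list above may be the least familiar. As a rough sketch only, assuming a simple in-memory list of facts, the idea is to store two time axes per record: when the fact was true in the real world and when the system learned about it. The `Fact` class, the `as_of` function and the sample data are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    key: str
    value: str
    valid_from: date   # when the fact became true in the real world
    valid_to: date     # exclusive end of real-world validity
    recorded_on: date  # when the system learned about it


def as_of(facts, key, valid_on, known_on) -> Optional[str]:
    """Return the value valid on `valid_on`, as the system knew it on `known_on`."""
    candidates = [
        f for f in facts
        if f.key == key
        and f.valid_from <= valid_on < f.valid_to
        and f.recorded_on <= known_on
    ]
    if not candidates:
        return None
    # The most recently recorded matching fact wins.
    return max(candidates, key=lambda f: f.recorded_on).value


FOREVER = date.max
KEY = "customer-42/address"
history = [
    Fact(KEY, "1 Old Street", date(2015, 1, 1), FOREVER, date(2015, 1, 5)),
    # Correction recorded in March 2017: the customer had in fact moved in June 2016.
    Fact(KEY, "1 Old Street", date(2015, 1, 1), date(2016, 6, 1), date(2017, 3, 1)),
    Fact(KEY, "9 New Road", date(2016, 6, 1), FOREVER, date(2017, 3, 1)),
]

# What did we believe in January 2017 about December 2016? And what do we believe now?
believed_then = as_of(history, KEY, date(2016, 12, 1), date(2017, 1, 1))
believed_now = as_of(history, KEY, date(2016, 12, 1), date(2017, 6, 1))
```

Nothing is overwritten: the correction is appended, so the system can still reproduce what it believed at any earlier date, which matters when demonstrating compliance after the fact.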
A revolution similar to the advent of networked servers in the early 1990s.
In this vision of the future, EDM will continue to exist in the form of rules within an architecture based on NoSQL databases, with statistical, semantic and even predictive search engines used to create what is known as artificial intelligence. Information governance and interoperability will become reality.
Document (and therefore data) classification will be represented dynamically to users, firstly on the basis of classification rules and secondly on the basis of business and security rules, and document and data retention schedules will also be applied as rules. These developments, which in my (hopefully correct) opinion can be referred to as genuinely disruptive, are reminiscent of the advent of networked servers in the early 1990s, which replaced the mainframes or large-scale systems that had previously been the only options for handling the processing demands of the time. These networked servers, which are still in use today, represented a huge step forward thanks to reduced costs, greater deployment and maintenance flexibility and (most importantly) improved performance.
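To make the idea of classification and retention "applied as rules" more concrete, here is a small sketch under stated assumptions: each rule is a predicate over a document's metadata, and the first matching rule supplies both the category and the retention period. The rule names, categories, retention periods and the `closed_on` metadata field are invented for illustration and do not reflect any legal requirement.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate over document metadata
    category: str                    # classification assigned on a match
    retention_years: int             # how long the document is kept


# Hypothetical rules; real ones depend on the organization and its legal context.
RULES = [
    Rule("invoices", lambda m: m.get("type") == "invoice", "Finance/Invoices", 10),
    Rule("hr-files", lambda m: m.get("department") == "HR", "HR/Personnel", 5),
]
DEFAULT = Rule("default", lambda m: True, "Unclassified", 1)


def add_years(d: date, years: int) -> date:
    """Add whole years, mapping 29 February to 28 February when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def evaluate(metadata: dict) -> tuple[str, date]:
    """Return (category, disposal date) from the first matching rule."""
    rule = next((r for r in RULES if r.matches(metadata)), DEFAULT)
    return rule.category, add_years(metadata["closed_on"], rule.retention_years)


invoice = evaluate({"type": "invoice", "closed_on": date(2017, 3, 31)})
hr_file = evaluate({"department": "HR", "closed_on": date(2016, 2, 29)})
other = evaluate({"closed_on": date(2017, 5, 1)})
```

Because the classification is computed from metadata at query time rather than fixed in a folder hierarchy, changing a rule changes how every affected document is presented and retained, without moving any files.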
The changes I expect to see include the following:
- infrastructure will undoubtedly migrate to the cloud; the principal outcome of this development will be the disappearance of burdensome in-house hardware with the associated constraints of managing back-ups, upgrades etc.;
- this trend will see Office 365 and Google fighting it out for first place in the collaboration software market, with Microsoft likely to come out in front thanks to greater business penetration in the form of SharePoint and Dynamics. Like all cloud applications, solutions in this area must be accessible from anywhere and with mobile devices;
- CIOs will disappear, just as computer scientists have been handing over the baton to computer graphics artists, SEO experts, marketing specialists and so on since the earliest years of the Internet. IT management will be synonymous with information management, and will be controlled by the owners of the relevant business processes and data;
- the solutions currently in place will be superseded by those which can handle big data, databases will gradually shift from SQL to NoSQL, and a distinction will be made at architectural level between data and documents or processing;
- in five years’ time, a reference to a certain piece of data will give added value to the data itself by making it possible to find, manage and protect the data by means of associated reference documents. Data storage in the cloud will be a trivial matter, with n-fold reproduction and fingerprint archiving with third parties. Nothing will be static about the information systems used for dynamically changing data. Roll on 2022!