BusinessDay: DocInsights - sifting through data made easier

By Johan Steyn, 31 October 2022

There are many software platforms that allow us to search for and retrieve documents. We may need them for a legal matter or to review the contents to make a decision. But the challenge arises when you deal with hundreds of thousands of documents.

No team of human administrators can make sense of the trends and potential issues hidden in such large volumes of data.

The other challenge is that even though documents are digitised using optical character recognition software, workers’ ability to search for and find relevant data is limited. Unless you enter the exact phrase that you seek, you are unlikely to find answers.

Dries Cronje, founder of Deep Learning Café, saw the need for a platform that used artificial intelligence (AI) to draw information from large volumes of documentation. It all started when his wife, a lawyer, received a 29GB disk containing more than 10,000 e-mails. Her client asked that she review the information and deliver a legal opinion.

Cronje created an intelligent search engine that allowed his wife to unearth data trends in the documents, and she was able to finalise an opinion within a few hours. This led to the birth of DocInsights.

Cronje told me that his team built a platform that uses AI to extract data from various document sources such as handwritten notes, PDFs, invoices, e-mails and even audio files. There are many similar platforms available and I was keen to hear how DocInsights is different.

“Most platforms use straight keyword searches. This means that your search will result only in the specific word or phrase you enter. What is lacking is contextual understanding. You will miss out on a large amount of important and relevant data by this kind of search methodology.

“Our platform recognises subtle semantic similarities. We employ natural language processing to identify phrases that are relevant with a similar meaning to what the user is searching for.”
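To make the distinction concrete, here is a minimal Python sketch of the two approaches. It is not DocInsights' code, and the article does not describe how the platform works internally. The small `CONCEPTS` table, the sample documents and every function name are assumptions made for illustration. A production system would typically learn word meanings from data rather than rely on a hand-written table.

```python
import math
import re
from collections import Counter

# Hypothetical hand-written map from words to broad concepts.
# This only stands in for the learned language models a real system would use.
CONCEPTS = {
    "payment": "money", "invoice": "money", "funds": "money",
    "settled": "money", "transfer": "money", "wired": "money",
    "meeting": "schedule", "moved": "schedule", "postponed": "schedule",
}

# Illustrative sample documents (not real data).
DOCS = [
    "The invoice was settled by bank transfer on Friday.",
    "Funds were wired to the supplier last week.",
    "The meeting was moved to Monday.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_search(query, docs):
    """Return only the documents that contain the exact query phrase."""
    phrase = query.lower()
    return [doc for doc in docs if phrase in doc.lower()]


def concept_vector(text):
    """Count how often each known concept appears in the text."""
    return Counter(CONCEPTS[word] for word in tokenize(text) if word in CONCEPTS)


def cosine(a, b):
    """Cosine similarity between two concept-count vectors."""
    dot = sum(a[key] * b[key] for key in a)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_search(query, docs, threshold=0.5):
    """Return documents whose concepts overlap with the query's concepts."""
    query_vector = concept_vector(query)
    return [doc for doc in docs if cosine(query_vector, concept_vector(doc)) >= threshold]


if __name__ == "__main__":
    print(keyword_search("payment made", DOCS))
    print(semantic_search("payment made", DOCS))
```

In this toy example, the exact phrase "payment made" appears in none of the documents, so the keyword search returns nothing. The concept-based search still surfaces the two documents about money changing hands, even though neither contains the word "payment".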

Cronje explained that the platform is ideal for forensic experts and law firms. In fact, one of their customers is a large local intellectual property law firm.

“We have also done a lot of work with investigative journalists.”

In one notable use case, his team loaded the thousands of pages of the Zondo commission report into the platform alongside the e-mails known as the Gupta leaks. “We were able to show the relations between entities that human investigators were unable to find.”

What makes DocInsights unique is that teams can collaborate on documents and arrange their findings through a customisable, smart chronology builder. The platform also provides auto-backlinks to the original source files.

“Our platform was built in SA with the African market in mind. We made sure that it is affordable and very easy to use. Our clients find that they discover significant and important insights quickly.”

DocInsights is a proudly SA platform. Business users should take note. You can be confident that no detail is overlooked and that no page is left unturned.

• Steyn is on the faculty at Woxsen University, a research fellow at Stellenbosch University and the founder of


