LearnITy™ Knowledge Engine
Documents to Insights
Accessing embedded document data has become one of the most highly sought-after technologies in sectors such as financial services, real estate, insurance, government, and healthcare. These industries all share a central challenge: how can we automate document processing to extract the fundamental structured information they contain?
LearnITy™ Knowledge Engine (LKE) is a Document Understanding (DU) Platform that addresses this challenge.
What is Document Understanding (DU)?
Document Understanding (DU) is a set of Artificial Intelligence (AI) technologies that turn unstructured data (e.g., plain text paragraphs) and semi-structured data (e.g., headers, footers, tables, and lists) into a structured format, recognise and extract information across differing document formats, and continuously learn and improve over time, in order to aid decision making and process automation.
Applications of Deep Document Understanding
- Information Extraction is the foundational application of DU where structured data embedded in documents need to be extracted based on various attributes of the data
- Document Categorisation and Search supports users who professionally search for information in document assets of their organisation (e.g., Patent applications, VoC)
- Document Validation (e.g., KYC/AML) checks for consistency of data elements in a document based on business rules and reference master data
- Document Comparison (e.g., comparison of Company Annual Reports) matches different data elements of 2 or more documents based on defined comparison criteria
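To make the Document Validation application above more concrete, here is a minimal Python sketch of checking one extracted record against reference master data and a business rule. It is not LKE code: the `MASTER_CUSTOMERS` table, the field names, and the `validate` function are all hypothetical and chosen only for illustration.

```python
# Hypothetical reference master data: known customer IDs and their countries.
MASTER_CUSTOMERS = {"C-1001": "IN", "C-1002": "GB"}


def validate(record):
    """Return a list of rule violations for one extracted record (illustrative only)."""
    errors = []
    # Master-data check: the customer must exist and the country must agree.
    if record.get("customer_id") not in MASTER_CUSTOMERS:
        errors.append("customer_id not found in master data")
    elif MASTER_CUSTOMERS[record["customer_id"]] != record.get("country"):
        errors.append("country does not match master data")
    # Business rule: date of birth must precede the document issue date.
    # ISO 8601 dates (YYYY-MM-DD) compare correctly as plain strings.
    if record.get("date_of_birth", "") >= record.get("issue_date", ""):
        errors.append("date_of_birth must be earlier than issue_date")
    return errors
```

An empty list means the record passed every check; each string in a non-empty list names one failed rule.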
How is LearnITy™ Knowledge Engine (LKE) unique?
- Apart from LKE, no solution in the market focuses on the full spectrum of DU capabilities (extraction, analysis, validation, comparison, search)
- Out-of-the-box solutions will not work for DU since no two organizations have the same DU requirements
- LKE provides a powerful DSL named Document Comprehension Language (DCL, patent pending), in which complex DU requirements involving multiple, large documents may be expressed
- The domain knowledge of the business users has to be fully utilised in building and validating any DU solution
- LKE facilitates adding such knowledge via a number of mechanisms such as business rules, custom dictionaries, ontologies, etc.
- Most current offerings are specific to an industry vertical and application, such as legal (Contract Understanding) or health (EHR Analysis), and are sold as full solutions
- LKE is also offered as an embeddable engine to ISVs and systems integrators
How LKE Works
There are three phases in LKE operation:
1. In this phase the documents are ingested, split into various components (sections, pages, paragraphs, sentences, clauses, tables, etc.), annotated in various ways, and saved in a DB. This part is done by the NLP Engine. Apart from the built-in annotations performed by the engine (e.g., part-of-speech, dependency parse, NER), custom annotations (e.g., custom NER, custom relation extraction) are also supported.
2. In this phase the various DU operations, such as Information Extraction and Document Comparison, are configured. For this, the DU requirements are expressed in a declarative DSL named Document Comprehension Language (DCL). DCL is based on XML notation and contains various tags that are used to implement the DU operations. The annotations saved in phase 1 are utilised by DCL. The DU requirements expressed in DCL syntax are saved as templates (text files) and are uploaded to LKE.
3. In this phase the actual DU operations are performed based on the ingested documents and configured DCL templates, and the results are produced in human-digestible formats such as Excel/Word/PDF reports with corporate branding, and/or in machine-digestible formats (JSON/XML) that may be passed to downstream applications.
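The three phases above can be sketched as a toy Python pipeline. This is only an illustration of the ingest, configure, and run flow under simplifying assumptions: a regular expression stands in for the NLP Engine's annotations, an in-memory list stands in for the DB, and a plain dictionary stands in for a DCL template (actual DCL syntax is XML-based and is not shown here). All function and field names are hypothetical.

```python
import json
import re


# Phase 1 (stand-in): split a document into paragraphs and sentences and
# attach a simple annotation. A real NLP engine would add part-of-speech
# tags, dependency parses, named entities, and so on.
def ingest(text):
    store = []  # stand-in for the annotation DB
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    for p_idx, paragraph in enumerate(paragraphs):
        sentences = re.split(r"(?<=[.!?])\s+", paragraph)
        for s_idx, sentence in enumerate(sentences):
            store.append({
                "paragraph": p_idx,
                "sentence": s_idx,
                "text": sentence,
                # Toy annotation: currency amounts such as "USD 1,200.50".
                "amounts": re.findall(r"\b[A-Z]{3} [\d,]+(?:\.\d+)?", sentence),
            })
    return store


# Phase 2 (stand-in): a declarative extraction template.
# This dictionary is NOT DCL; it only mimics the idea of a saved template.
TEMPLATE = {
    "fields": [
        {"name": "invoice_total", "keyword": "total", "annotation": "amounts"},
    ]
}


# Phase 3: apply the template to the stored annotations.
def run(store, template):
    result = {}
    for field in template["fields"]:
        result[field["name"]] = None
        for record in store:
            values = record[field["annotation"]]
            # Take the first annotated value in a sentence containing the keyword.
            if field["keyword"] in record["text"].lower() and values:
                result[field["name"]] = values[0]
                break
    return result


if __name__ == "__main__":
    doc = ("Invoice 17 was issued in May.\n\n"
           "The total due is USD 1,200.50. Payment is due in 30 days.")
    # Machine-digestible output for downstream applications.
    print(json.dumps(run(ingest(doc), TEMPLATE), indent=2))
```

Keeping the template separate from the ingestion code mirrors the split described above: documents are annotated once, and different templates can then be applied to the same stored annotations.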
LKE is a DU Platform – industry vertical and business function agnostic
- LKE is a Document Understanding Platform
- It is provided as a collection of capabilities that are not tied to any specific business function or business vertical
- It may be used to support the DU requirements of any business function, such as Finance, Legal, Operations, HR, …
- It may be used in any business vertical such as Banking, Healthcare, Manufacturing, …
- LKE empowers organisations of all sizes to turn documents into insights, without requiring them to make a big investment in AI/ML expertise
LKE – Focus on Domain Knowledge
- LKE enables quick addition of business domain knowledge to the document understanding process.
- We realise that the people who run the business have the best knowledge that may be harnessed fruitfully in making the DU process more effective and accurate
- LKE empowers business users to easily add knowledge that is specific to the business via a number of mechanisms such as business rules, custom dictionaries, ontologies, etc.
- Human-in-the-loop is thus an integral part of all LKE capabilities (custom entity creation, business rule formulation, model validation, etc.)
How LKE generates output
- The internally stored results of various document understanding tasks may be used to generate a variety of reports in PDF, Word, and Excel, making them accessible and easy to use for business users
- LKE supports document templates (e.g., Excel and Word templates) that may be used to generate result documents based on your organizational branding and other guidelines
- LKE also generates the required information in machine readable formats (JSON, XML) for consumption by downstream applications
- Output may also be generated in industry-standard formats like SWIFT or XBRL
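As an illustration of the machine-readable outputs listed above, the following Python sketch serialises one flat result record to both JSON and XML using only the standard library. It is not LKE's actual output schema: the `to_json` and `to_xml` helpers and the `result` root tag are assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET


def to_json(result):
    """Serialise a flat result dict to JSON with a stable key order."""
    return json.dumps(result, sort_keys=True)


def to_xml(result, root_tag="result"):
    """Serialise a flat result dict to XML, one child element per field.

    Assumes every key is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for name, value in sorted(result.items()):
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Both helpers take the same dictionary, so a downstream application can request whichever format it consumes without the extraction step changing.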
LKE – Under the Hood
- LKE converts text into a representation of meaning that can satisfy a broad set of information needs
- Documents are processed using technologies such as semantic parsing to create a Knowledge Base
- This knowledge base is complemented by external knowledge sources such as Wikipedia as well as domain-specific knowledge bases
- Logical reasoning is applied to the knowledge assets created to fulfil the document understanding requirements of the business