We are looking for license partners to commercialise a number of legacy intellectual property (IP) projects from large-scale R&D programmes. This R&D project, completed in 2009, uncovered a way of analysing text to extract valuable information into a single electronic format.

ITI Scotland (now Scottish Enterprise), with the University of Edinburgh School of Informatics, developed a combination of advanced text mining software technologies known as TXM.

Text mining harnesses computer power to analyse large quantities of unstructured text. It identifies, locates and extracts critical information and presents it in a clear and succinct way, facilitating the creation of comprehensive, searchable databases for novel discovery.

Text Mining, or Text Analytics as it is increasingly known, is both a technology and a process. It is a mechanism for the discovery of knowledge from documents. It is a means of finding value in text.

The technology mines documents and other forms of ‘unstructured’ electronic data. It does this by analysing linguistic structure and by applying statistical and machine-learning techniques to discern entities (names, dates, places, terms, proteins, etc) and their attributes, as well as relationships, concepts, and even sentiments. These ‘features’ are extracted to databases for further analysis, automated classification and processing of the source documents.

These databases use visualisation approaches for the exploratory analysis of the discovered information.
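As an illustration of extracting 'features' to a database, here is a minimal sketch in Python. It is not the TXM implementation: two hand-written regular expressions stand in for the linguistic and machine-learning models, and the pattern set, table layout and function names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical patterns: a toy stand-in for linguistic and machine-learning models.
PATTERNS = {
    "date": re.compile(r"\b\d{4}\b"),
    "organisation": re.compile(r"\bUniversity of [A-Z][a-z]+\b"),
}


def extract_entities(text):
    """Return (entity_type, surface, start_offset) tuples sorted by offset."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((entity_type, match.group(0), match.start()))
    return sorted(found, key=lambda item: item[2])


def store_entities(connection, doc_id, entities):
    """Write extracted entities to a table (hypothetical schema) for later querying."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS entities "
        "(doc_id TEXT, entity_type TEXT, surface TEXT, start_offset INTEGER)"
    )
    connection.executemany(
        "INSERT INTO entities VALUES (?, ?, ?, ?)",
        [(doc_id, kind, surface, offset) for kind, surface, offset in entities],
    )
    connection.commit()


sample = "The project with the University of Edinburgh was completed in 2009."
conn = sqlite3.connect(":memory:")
store_entities(conn, "doc-1", extract_entities(sample))
rows = conn.execute(
    "SELECT entity_type, surface FROM entities ORDER BY start_offset"
).fetchall()
print(rows)
```

Once the entities are stored as rows, they can be searched, counted or fed to visualisation tools with ordinary database queries.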

The commercial opportunity

TXM employs complex NLP (Natural Language Processing) technologies to analyse text and cross-reference data with large databases of background knowledge. The scalable TXM system is accessed through a web-based interface, which allows simultaneous users to be located at different sites anywhere in the world.

TXM offers:

  • A flexible, generic text mining system that creates structure from unstructured information
  • Automated extraction of valuable information
  • Centralised document management in a single electronic format
  • Conversion of PDF source documents to machine-readable XML, followed by linguistic pre-processing, extraction of entities and their relationships, and assignment of identifiers
  • Task-specific NLP modules for particular domain applications
  • A modular design for scalability and manageability
  • Adaptability across a wide variety of domains such as legal, financial, intellectual property and life sciences
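The modular processing chain in the list above (conversion to XML, pre-processing, entity and relationship extraction, identifier assignment) can be sketched as a sequence of interchangeable stages. This is only an illustrative sketch, not TXM's actual design: it assumes the PDF text has already been extracted, uses a small hypothetical gazetteer in place of real NLP modules and background-knowledge databases, and treats co-occurrence in a sentence as a stand-in for relationship extraction. All element and function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Hypothetical background knowledge: term -> entity type.
GAZETTEER = {
    "ITI Scotland": "organisation",
    "University of Edinburgh": "organisation",
    "TXM": "technology",
}


def to_xml(doc_id, raw_text):
    """Wrap already-extracted plain text in a minimal XML document."""
    root = ET.Element("document", id=doc_id)
    ET.SubElement(root, "text").text = raw_text
    return root


def preprocess(root):
    """Toy linguistic pre-processing: split the text into sentences on full stops."""
    sentences = ET.SubElement(root, "sentences")
    parts = [s.strip() for s in root.find("text").text.split(".") if s.strip()]
    for number, sentence in enumerate(parts):
        ET.SubElement(sentences, "s", n=str(number)).text = sentence
    return root


def tag_entities(root):
    """Look up gazetteer terms in each sentence and assign identifiers e1, e2, ..."""
    entities = ET.SubElement(root, "entities")
    counter = 0
    for sentence in list(root.iter("s")):
        for term, entity_type in GAZETTEER.items():
            if term in sentence.text:
                counter += 1
                ET.SubElement(
                    entities, "entity",
                    id=f"e{counter}", type=entity_type, sentence=sentence.get("n"),
                ).text = term
    return root


def link_entities(root):
    """Toy relationship stage: relate entities that share a sentence."""
    relations = ET.SubElement(root, "relations")
    by_sentence = {}
    for entity in root.find("entities"):
        by_sentence.setdefault(entity.get("sentence"), []).append(entity.get("id"))
    for ids in by_sentence.values():
        for left, right in combinations(ids, 2):
            ET.SubElement(relations, "relation",
                          type="co-occurrence", arg1=left, arg2=right)
    return root


# Stages can be swapped or extended for a particular domain.
STAGES = [preprocess, tag_entities, link_entities]


def run_pipeline(doc_id, raw_text):
    root = to_xml(doc_id, raw_text)
    for stage in STAGES:
        root = stage(root)
    return root


result = run_pipeline(
    "doc-1", "ITI Scotland developed TXM. The University of Edinburgh took part."
)
print(ET.tostring(result, encoding="unicode"))
```

Because every stage takes and returns the same XML tree, a domain-specific module could be swapped into `STAGES` without touching the others, which is one way to read the modular-design point above.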

Potential market applications include:

  • Recruitment
  • Patents and Intellectual Property
  • Legal, Tax and Regulatory
  • Financial Services
  • Healthcare
  • Games Industry
  • Corporate Compliance
  • Engineering and Manufacturing
  • Science and Social Science
  • Intelligence and Counter-terrorism
  • Law Enforcement
  • Life Sciences

Benefits of TXM technologies

  • Centralises information and converts to a single electronic format – improves document handling and selection
  • More efficient research: reduced time and greater productivity
  • Greater accuracy with reduced human error
  • Reduces task tedium and frustration for the end user
  • Fewer items of importance are missed
  • Valuable, but otherwise hidden, items are discovered
  • Output quality is increased

What we're looking for

If you are interested in these assets or require further information, please contact David Middleton or Stephen Moore.