
Posts by Gareth Clews


Optimus – A natural language processing pipeline for turning free-text lists into hierarchical datasets
Many datasets contain variables that consist of short free-text descriptions of items or products. This technical report describes how we developed and implemented a natural language processing (NLP) pipeline that turns such descriptions into a well-structured, hierarchical dataset.
optimus – turning free-text lists into hierarchical datasets
Project summary: Many datasets contain variables that have been collected as free-text in an uncontrolled way. In the case where…