Software

This page provides pointers to software developed in the course of R&D activities of the Information Engineering Research Unit (IE).

IE is committed to open source, and most of the software produced is released under the LGPL license. Not all of the software we have produced is actively maintained, since maintenance depends on the ongoing R&D activities of the group.

= Software libraries (active) =

The software libraries listed here are complete frameworks for particular purposes. This section includes only those that are currently active or that we plan to continue developing.

ehr2ont
ehr2ont is a generic Java library aimed at the translation of OpenEHR archetypes to ontology languages.

OpenEHR archetypes are reusable definitions for fragments of clinical information. Translating them to ontology languages enables the use of inference and the mapping to existing formal ontologies in the biomedical and clinical domains.

ehr2ont currently supports part of OpenEHR ADL version 1.2, and it both generates and uses OWL.

The project is hosted in the ehr2ont Google Code repository.

ontometrics
ontometrics is a flexible and extensible Java library implementing some common ontology metrics.

The project is hosted in the ontometrics SourceForge code repository.

fuzzy-lib
fuzzy-lib is a Java framework for computing with fuzzy set theory constructs, including fuzzy sets, fuzzy arithmetic and more.

The project is hosted in the fuzzy-lib SourceForge code repository.
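
To give a flavour of the kind of construct such a framework deals with, here is a minimal sketch of a triangular fuzzy number with a membership function and fuzzy addition. It is an illustration only: the class `TriangularFuzzyNumber` and its methods are hypothetical names chosen for this example and are not taken from the fuzzy-lib API.

```java
// Hypothetical illustration of a fuzzy set and fuzzy arithmetic.
// This is NOT the fuzzy-lib API; all names are assumptions.
final class TriangularFuzzyNumber {
    final double left;   // membership is 0 at and below this point
    final double peak;   // membership is 1 at this point
    final double right;  // membership is 0 at and above this point

    TriangularFuzzyNumber(double left, double peak, double right) {
        if (!(left <= peak && peak <= right)) {
            throw new IllegalArgumentException("expected left <= peak <= right");
        }
        this.left = left;
        this.peak = peak;
        this.right = right;
    }

    /** Degree (in [0, 1]) to which x belongs to this fuzzy number. */
    double membership(double x) {
        if (x == peak) {
            return 1.0;
        }
        if (x <= left || x >= right) {
            return 0.0;
        }
        return x < peak
                ? (x - left) / (peak - left)     // rising edge
                : (right - x) / (right - peak);  // falling edge
    }

    /** Fuzzy addition: for triangular numbers, the three defining points add up. */
    TriangularFuzzyNumber plus(TriangularFuzzyNumber other) {
        return new TriangularFuzzyNumber(
                left + other.left, peak + other.peak, right + other.right);
    }
}
```

For example, adding "about 2" `(1, 2, 3)` to "about 4" `(2, 4, 6)` yields "about 6" `(3, 6, 9)`, in which the value 7.5 has membership 0.5.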

open-parametrics
open-parametrics is a Java framework that provides an API for working with parametric models and integrates techniques and algorithms for their generation, calibration and evaluation. It currently collects the code created in recent years in the course of research in parametrics for Software Engineering, but it is open and generic enough to include parametrics for other domains as well as additional models and techniques.

The project is hosted at Google Code.

= Programs for experiments and studies =

This section lists extensions to tools such as Weka or small software programs developed ad hoc for particular research needs.

SimpleKMeans-for-MLST
This is an implementation of a variant of the SimpleKMeans algorithm provided in Weka 3.4 that uses global alignment of sequences of housekeeping genes to drive the clustering process. It has been designed for clustering bacterial isolates using data from MLST databases, although it might be used for other similar purposes.

The source code is provided in the links below.

Sequence alignment requires BioJava 1.6.
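
As a rough illustration of how global alignment can feed a clustering process, the following sketch computes a Needleman-Wunsch global alignment score with a linear gap penalty and turns it into a dissimilarity value. It is not the distributed source code and does not use BioJava: the class `GlobalAlignment`, the scoring values (match +1, mismatch -1, gap -2) and the `distance` definition are all assumptions made for this example.

```java
// Illustrative sketch only; not the code distributed with SimpleKMeans-for-MLST.
// Scoring parameters and the distance definition are assumptions.
final class GlobalAlignment {

    /** Needleman-Wunsch global alignment score with a linear gap penalty. */
    static int score(String a, String b, int match, int mismatch, int gap) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Aligning an empty prefix of a against b[0..j) costs j gaps.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j * gap;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i * gap;
            for (int j = 1; j <= b.length(); j++) {
                int diag = prev[j - 1]
                        + (a.charAt(i - 1) == b.charAt(j - 1) ? match : mismatch);
                int up = prev[j] + gap;        // gap in b
                int left = curr[j - 1] + gap;  // gap in a
                curr[j] = Math.max(diag, Math.max(up, left));
            }
            int[] tmp = prev;  // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Dissimilarity derived from the score: 0 for identical sequences,
     * growing as the best global alignment gets worse.
     */
    static int distance(String a, String b) {
        int match = 1, mismatch = -1, gap = -2;  // assumed scoring scheme
        int best = Math.max(score(a, a, match, mismatch, gap),
                            score(b, b, match, mismatch, gap));
        return best - score(a, b, match, mismatch, gap);
    }
}
```

A clustering step could then use a value such as `GlobalAlignment.distance(seq1, seq2)` where a numeric k-means would use a Euclidean distance. How the actual implementation combines alignment with cluster assignment may differ.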

Additionally, you can download a small utility program that retrieves data from an MLST database into your local relational store (to generate the stubs required for the calls according to the WSDL file, you would need to add it to your Web Service project, for example in NetBeans). You can also download an associated small utility that reads data from that relational schema and converts it to the Weka ARFF format. This was used to obtain data for testing the modified k-means algorithm.
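
For readers unfamiliar with ARFF, the following sketch shows roughly what such a conversion produces. It is not the utility mentioned above: the class `ArffSketch`, the choice of declaring every column as a `string` attribute, and the attribute names used in the usage example are assumptions. Rows are passed in as plain lists, so no database access is involved.

```java
import java.util.List;

// Illustrative sketch only; not the conversion utility referred to in the text.
final class ArffSketch {

    /** Renders rows as an ARFF document in which every attribute is a string. */
    static String toArff(String relation, List<String> attributes,
                         List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (String name : attributes) {
            sb.append("@attribute ").append(name).append(" string\n");
        }
        sb.append("\n@data\n");
        for (List<String> row : rows) {
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(quote(row.get(i)));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Single-quotes a value, escaping backslashes and embedded quotes. */
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }
}
```

Calling `toArff("mlst", Arrays.asList("isolate", "locus_1"), ...)` with one row `("ST1", "ACGT")` gives a header with two `@attribute` lines followed by the data line `'ST1','ACGT'`. The names `mlst`, `isolate` and `locus_1` are hypothetical.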

GASUD
GASUD (Genetic Algorithm for SUbgroup Discovery) is a Weka 3.6 extension algorithm that uses Genetic Algorithms to generate rules for subgroup discovery. Instead of finding rules for all classes, this algorithm focuses on one class only, generally the minority class when datasets are unbalanced. The source code is provided below.
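
To illustrate what evaluating a rule for a single target class can look like, here is a sketch of weighted relative accuracy (WRAcc), a quality measure commonly used in subgroup discovery. This is not the GASUD source code, and we do not claim that GASUD uses this measure as its fitness function. The class `SubgroupQuality` and its method are hypothetical names for this example.

```java
// Illustrative sketch only; not the GASUD source code.
// WRAcc is a common subgroup discovery measure, assumed here as an example.
final class SubgroupQuality {

    /**
     * Weighted relative accuracy of a rule for one target class:
     * WRAcc = p(covered) * (p(target | covered) - p(target)).
     *
     * @param covered covered[i] is true if the rule's condition matches instance i
     * @param target  target[i] is true if instance i belongs to the target class
     */
    static double wracc(boolean[] covered, boolean[] target) {
        int n = covered.length;
        if (n == 0 || target.length != n) {
            throw new IllegalArgumentException("arrays must be non-empty and equal in length");
        }
        int coveredCount = 0;
        int targetCount = 0;
        int coveredTargetCount = 0;
        for (int i = 0; i < n; i++) {
            if (covered[i]) {
                coveredCount++;
            }
            if (target[i]) {
                targetCount++;
            }
            if (covered[i] && target[i]) {
                coveredTargetCount++;
            }
        }
        if (coveredCount == 0) {
            return 0.0;  // a rule that covers nothing is not interesting
        }
        double coverage = (double) coveredCount / n;
        double precision = (double) coveredTargetCount / coveredCount;
        double prior = (double) targetCount / n;
        return coverage * (precision - prior);
    }
}
```

A rule that covers exactly the target-class instances in a balanced four-instance dataset scores 0.25, while a rule that covers every instance scores 0, since it says nothing specific about the target class. A genetic algorithm could use a score of this kind to rank candidate rules for the chosen class.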