Scalable Visual Data Analytics with Orange Data Mining Toolbox

Project Orange Data Mining

Blaž Zupan (University of Ljubljana)

Funding Cycle 4

Proposal Summary

To refactor the Orange Data Mining toolbox to use the latest Python libraries for parallel, server-based data analysis, allowing it to scale to large biomedical datasets.


Orange Data Mining

Workflow-building tools like the Orange Data Mining toolbox democratize data science by exposing an intuitive interface while hiding complex underlying mechanics. Orange owes its success to the Python ecosystem, particularly to NumPy, whose array is the backbone of the Orange Table, the data structure used by Orange components in the graphical user interface. This proposal aims to refactor Orange’s ecosystem with Dask, a scalable data analytics engine for Python. The refactored Orange will retain its simplicity and intuitive user interface while revamping the data infrastructure under the hood to democratize big data analytics.

Key Personnel

Blaž Zupan
Janez Demšar
Ajda Pretnar
Marko Toplak
Vesna Tanko
Aleš Erjavec
Rafael Irgolič