Scalable Visual Data Analytics with Orange Data Mining Toolbox


Project: Orange Data Mining
Funding Cycle: 4

Proposal Summary

To refactor the Orange Data Mining toolbox to incorporate the latest Python libraries for parallel, server-based data analysis, allowing it to scale to large biomedical datasets.


Project

Orange Data Mining

Workflow-building tools like the Orange Data Mining toolbox democratize data science by exposing an intuitive interface while hiding the complex underlying mechanics. Orange owes its success to the Python ecosystem, particularly to NumPy: the NumPy array is the backbone of the Orange Table, the data structure used by Orange components in the graphical user interface. This proposal aims to refactor Orange’s ecosystem around Dask, a Python library for parallel, scalable data analytics. The refactored Orange will retain its simplicity and intuitive user interface while revamping the underlying data infrastructure to democratize big data analytics.


Key Personnel

Blaž Zupan
Janez Demšar
Ajda Pretnar
Marko Toplak
Vesna Tanko
Aleš Erjavec
Rafael Irgolič