Predictive Models Evaluation & Inspection in Scikit-learn
To sustain the scikit-learn library through maintenance, improvement, and extension, notably in the domain of predictive model evaluation and inspection. Note: This proposal was funded by Wellcome Trust as part of our co-funded EOSS Cycle 6.
Project Lead: Guillaume Lemaitre (Inria Foundation)
Reproducibility in Bioinformatics by Sustaining Bioconda Development
To establish teaching material, improve documentation, and minimize maintenance effort of the Bioconda project by extending automation of code review, testing, and building.
Project Lead: Johannes Köster (University of Duisburg-Essen, Bioconda Core Team)
Revitalizing NetworkX for Complex Network Analysis
To meet the needs of the scientific community over the next decade, this team will revitalize NetworkX — the fundamental network analysis tool in Python — by growing its developer community, refactoring code, improving performance, and making a major release.
Project Lead: Stefan van der Walt (University of California, Berkeley)
Salmon: Improving RNA-seq Quantification & Building an Inclusive Community
To advance support and development of the open-source Salmon and Alevin software for gene expression quantification of single-cell and bulk RNA-seq.
Project Lead: Carl Kingsford (Ocean Genomics)
Scalable Storage of Tensor Data for Scientific Computing
To establish Zarr as a foundation for scientific data storage, with clear data format and protocol specifications, implementations in multiple programming languages, and a community process for evolving to support new scientific applications.
Project Lead: Ryan Williams (Mount Sinai School of Medicine)
Scalable Visual Data Analytics with Orange Data Mining Toolbox
To refactor the Orange Data Mining toolbox to include the latest Python libraries for parallel, server-based data analysis, allowing it to scale to large biomedical datasets.
Project Lead: Blaž Zupan (University of Ljubljana)