FOSDEM'23 HPC, Big Data, and Data Science Devroom

Sunday 5 February 2023, Brussels, Belgium (in-person)

Overview

Welcome to the 8th edition of the HPC, Big Data and Data Science devroom, co-located with FOSDEM 2023. FOSDEM is an annual conference about free and open source software, attended by over 8,000 developers and open-source enthusiasts from all over the world. This devroom is organised by representatives from the HPC and Big Data communities, who are joining forces to bring both communities together.

  • High Performance Computing (HPC) and Big Data are two important approaches to scientific computing. HPC typically deals with smaller, highly structured data sets and huge amounts of computation while Big Data, not surprisingly, deals with gigantic, unstructured data sets or data streams, usually processed with the help of distributed systems. When the Big Data trend unlocked access to an unprecedented amount of data, Data Science emerged to tackle the problem of creating processes and approaches to extracting knowledge or insights from these data sets. Machine learning and predictive analytics algorithms have joined the family of more traditional HPC algorithms and are pushing the requirements of cluster and data scalability.

  • Free and Open Source communities have been the foundation of the HPC and Big Data communities for some time. In the HPC community, it should be no surprise that every one of the 500 systems on the Top500 supercomputers list runs Linux. On the Big Data side, the Apache Big Data ecosystem (e.g. Apache Hadoop/Flink/Spark/Kafka) has received a tremendous amount of Open Source contributions from a wide range of organizations coming together under the Apache Software Foundation.

  • Our goal is to bring the communities together, share expertise, learn how we can benefit from each other’s work, and foster further joint research and collaboration. We welcome talks about Free and Open Source solutions to the challenges presented by large-scale computing, data management and data analysis.

The HPC, Big Data, and Data Science devroom will take place on Sun 5 February 2023.

FOSDEM’23 will be a hybrid event, with the HPC, Big Data, and Data Science devroom taking place physically on Sunday 5 February 2023.

Join us to enjoy a variety of talks, demos and interesting discussions on open-source HPC, Big Data and Data Science.

Sounds interesting? Submit your talk proposal below and see you in Brussels!

Topics

Topics of interest include, but are not limited to:

  • Architecture and design of High Performance Computing (HPC) and Big Data systems
  • Architecture and design of Extract, Transform and Load (ETL) and data acquisition pipelines
  • Data security and governance
  • Tools and technologies related to HPC and computational science, for example:
    • Multithreading (OpenMP, etc.)
    • Distributed computing (MPI, etc.)
    • GPGPU computing (OpenCL, OpenACC, etc.)
    • Parallel filesystems and storage
    • Large-scale performance analysis and debugging
  • Computational paradigms for Big Data systems
    • MapReduce engines
    • Streaming engines
    • SQL engines
    • Dataflow engines
  • Emerging hardware trends in large-scale clusters
    • Large-scale memory pooling
    • High-speed interconnects
    • ARM cluster architecture
  • System administration of HPC and Big Data clusters
  • User support tools
  • Machine learning libraries and tools
  • Scientific software applications, tools and libraries (across all scientific domains)
  • Big Data platforms, extensions to existing systems, libraries, APIs
  • Experience reports on using Big Data systems, for example:
    • Large-scale deployments
    • Development and configuration issues
    • Tuning and performance tips and lessons learned
  • Interesting Big Data use-cases and applications
  • Comparative analysis of existing systems, evaluation results, performance studies
  • Interdisciplinary HPC/Big Data use-cases, for example:
    • Applications using both HPC and Big Data technologies
    • Integration issues
    • Open research problems on the convergence of HPC and Big Data
    • Running MPI jobs on Big Data clusters and vice-versa

Submission

We invite presenters to submit talk proposals to present high-quality work with sufficient background material to be clear to the HPC, Big Data, and/or Data Science communities. Talk proposals should be submitted through the FOSDEM Pentabarf server. Submissions must include:

  • Abstract (plain text, a couple of paragraphs)
  • Session type
  • Session length
  • Expected prior knowledge / intended audience
  • Speaker bio
  • Links to code / slides / material for the talk (optional)
  • Links to previous talks by the speaker

Our intention is to have a series of talks of about 20 minutes each, with an additional 5-10 minutes for questions by attendees.

We would also like to note:

  • All accepted talks will be about (using) free and open source software.
    We highly discourage “marketing” talks.
  • All talks will be given in-person at FOSDEM in Brussels, on Sunday 5 February 2023.
    Please take this into account when submitting your talk.
    Remote presentations will unfortunately not be possible.

When submitting your talk in Pentabarf, make sure to select the ‘HPC, Big Data, and Data Science Devroom’ as the ‘Track’.

If you already have a Pentabarf account from a previous FOSDEM edition, please reuse it.
Create an account if, and only if, you don’t have one from a previous year. If you have any issues with Pentabarf, do not despair: contact hpc-bigdata-devroom [at] lists.fosdem.org.

Dates

Call for participation available: Wednesday 16 Nov 2022

Call for participation closes: Friday 16 Dec 2022

Devroom schedule available: Friday 30 Dec 2022

Devroom date: Sun 5 Feb 2023

If you would like to create an associated event for the devroom, please fork the page and send a pull request.

Organizers

Devroom volunteers:

Please take a moment to read the FOSDEM Code of Conduct.