Data Management and Visualization for Benchmarking Deep Learning Training Systems

Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

Abstract

Evaluating hardware for deep learning is challenging. The models can take days or more to run, the datasets are generally larger than what fits into memory, and the models are sensitive to interference. Scaling this up to a large number of experiments while keeping track of both software and hardware metrics thus poses real difficulties, as these problems are exacerbated by the sheer volume of experimental data. This paper explores some of the data management and exploration difficulties that arise in machine learning systems research. We introduce our solution in the form of an open-source framework built on top of a machine learning lifecycle platform. Additionally, we introduce a web environment for visualizing and exploring experimental data.
Original language: English
Title of host publication: Proceedings of the Seventh Workshop on Data Management for End-to-End Machine Learning, DEEM 2023, Seattle, WA, USA, 18 June 2023
Number of pages: 5
Publisher: Association for Computing Machinery
Publication date: 2023
Pages: 1:1-1:5
Publication status: Published - 2023