Introduction¶
Overview¶
DataCrumbs is a comprehensive I/O profiling and tracing tool designed for high-performance computing (HPC) environments. It leverages eBPF (extended Berkeley Packet Filter) technology to provide low-overhead, real-time monitoring and analysis of application I/O behavior at scale.
Key Features¶
Low-Overhead Profiling¶
DataCrumbs uses eBPF to achieve minimal performance overhead while capturing detailed I/O traces. Unlike traditional profiling tools that may introduce significant overhead, DataCrumbs operates in kernel space, minimizing interference with application execution.
Scalable Architecture¶
Designed for large-scale HPC systems, DataCrumbs can efficiently trace I/O operations across distributed applications running on thousands of nodes. The tool supports multiple job schedulers including:
FLUX
SLURM
OpenMPI
Standalone execution
Multi-Layer Tracing¶
DataCrumbs captures I/O operations at multiple layers:
System Calls: Low-level I/O operations (open, read, write, close, etc.)
POSIX I/O: Standard C library I/O functions
MPI-IO: Parallel I/O operations in MPI applications
HDF5: High-level I/O library operations
Custom Functions: User-defined function tracing through configuration
Flexible Deployment¶
DataCrumbs supports multiple deployment modes:
Server Mode: Long-running server process for continuous monitoring
Service Mode: Systemd service integration for managed execution
Wrapper Mode: Direct application wrapping for single executions
Composable Mode: Custom probe configurations for specific use cases
Real-Time Analysis¶
The tool provides:
DFTracer format output (
.pfw.gz) for efficient trace storage and analysisEvent aggregation and correlation across processes
Trace compression for efficient storage
Real-time event processing with ring buffers
Use Cases¶
DataCrumbs is particularly useful for:
Performance Analysis: Understanding I/O bottlenecks in HPC applications
I/O Pattern Discovery: Identifying access patterns and optimization opportunities
Debugging: Tracing file access patterns to diagnose issues
System Monitoring: Continuous monitoring of I/O behavior in production environments
Research: Studying I/O characteristics of scientific applications
Architecture¶
DataCrumbs consists of several key components:
eBPF Probes: Kernel-space programs that capture events with minimal overhead
Explorer: Discovers available probes and functions in libraries
Generator: Creates custom eBPF programs based on configuration
Server: User-space daemon that collects and processes events
Client Library: Injected into applications for tracking
Writer: Formats and outputs trace data in DFTracer format (
.pfw.gz)
Target Applications¶
DataCrumbs is designed for profiling:
MPI applications
Parallel I/O workloads
Scientific computing applications
Data-intensive applications
Any application using POSIX, HDF5, MPI-IO, or custom I/O libraries
License¶
DataCrumbs is developed at Lawrence Livermore National Laboratory and released under an open-source license. See the LICENSE file in the repository for details.
Citation¶
If you use DataCrumbs in your research, please cite the appropriate publication. See CITATION.cff in the repository for citation information.