Data Types and I/O Operations

This document covers OpenDAL's core data types and I/O operation APIs, including the high-level public types used by applications, the low-level raw I/O abstractions in the oio module, and the bridge between them. This page focuses on the concrete types and operations rather than the abstract interfaces.

For information about the high-level Operator API that orchestrates these types, see Operator Interface. For details about how storage services implement the underlying abstractions, see Access Trait and Service Implementation.

Core Data Types

OpenDAL provides several fundamental data types that represent data and metadata in storage operations:

Diagram: Core Data Type Hierarchy

Buffer Type

Buffer is OpenDAL's primary data container. It wraps bytes::Bytes to give the rest of the system one interface for handling data, converts from common Rust types, and supports zero-copy operations such as cheap clones and slices.
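As a rough illustration of why wrapping reference-counted bytes makes clones and slices cheap, here is a std-only sketch. MiniBuffer and its methods are hypothetical stand-ins, not OpenDAL's Buffer, and Arc<[u8]> is used here in place of bytes::Bytes.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for a zero-copy buffer (not OpenDAL's `Buffer`).
#[derive(Clone, Debug)]
pub struct MiniBuffer {
    data: Arc<[u8]>,     // shared, immutable storage
    range: Range<usize>, // the window of `data` this handle exposes
}

impl MiniBuffer {
    pub fn len(&self) -> usize {
        self.range.len()
    }

    pub fn is_empty(&self) -> bool {
        self.range.is_empty()
    }

    /// Returns a view of a sub-range without copying any bytes.
    /// Panics if `r` falls outside the current view.
    pub fn slice(&self, r: Range<usize>) -> MiniBuffer {
        assert!(r.start <= r.end && r.end <= self.len(), "slice out of bounds");
        MiniBuffer {
            data: Arc::clone(&self.data),
            range: (self.range.start + r.start)..(self.range.start + r.end),
        }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.data[self.range.clone()]
    }

    /// True when both handles point at the same allocation.
    pub fn shares_storage_with(&self, other: &MiniBuffer) -> bool {
        Arc::ptr_eq(&self.data, &other.data)
    }
}

impl From<Vec<u8>> for MiniBuffer {
    fn from(v: Vec<u8>) -> Self {
        let len = v.len();
        // The bytes are moved into shared storage once; later clones and
        // slices only bump the reference count.
        MiniBuffer { data: Arc::from(v), range: 0..len }
    }
}

impl From<&str> for MiniBuffer {
    fn from(s: &str) -> Self {
        MiniBuffer::from(s.as_bytes().to_vec())
    }
}
```

A slice of a slice still refers to the original allocation, which is the property that lets a buffer type of this kind be passed between layers without copying.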

Metadata and Entry Types

Metadata contains file attributes like size, modification time, content type, and ETags. Entry combines a path with its metadata, used primarily in directory listings. EntryMode distinguishes between files and directories.
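The relationship between these three types can be sketched with plain structs. The definitions below are simplified, hypothetical mirrors written for illustration; the field names and the total_file_size helper are assumptions, not OpenDAL's actual definitions.

```rust
/// Hypothetical, simplified mirror of an entry mode.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EntryMode {
    File,
    Dir,
    Unknown,
}

/// Hypothetical subset of the attributes a metadata type could carry.
#[derive(Clone, Debug)]
pub struct Metadata {
    pub mode: EntryMode,
    pub content_length: u64,
    pub content_type: Option<String>,
    pub etag: Option<String>,
}

/// A path paired with its metadata, as returned by a listing.
#[derive(Clone, Debug)]
pub struct Entry {
    pub path: String,
    pub metadata: Metadata,
}

/// Sums the size of all regular files in a listing, skipping directories.
pub fn total_file_size(entries: &[Entry]) -> u64 {
    entries
        .iter()
        .filter(|e| e.metadata.mode == EntryMode::File)
        .map(|e| e.metadata.content_length)
        .sum()
}
```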

Reader, Writer, and Lister Types

These represent active I/O operations: Reader for streaming reads, Writer for streaming writes, and Lister for directory traversal. Each supports both synchronous and asynchronous operation modes.

Sources: core/src/types/mod.rs:21-47, core/src/types/buffer.rs, core/src/types/metadata.rs, core/src/types/entry.rs, core/src/types/read.rs, core/src/types/write.rs, core/src/types/list.rs

High-Level I/O Operations

OpenDAL exposes each one-shot I/O operation in three forms (simple, builder-pattern, and options-based) and adds streaming interfaces for incremental I/O.

Diagram: I/O Operation Patterns

Simple Operations

Methods like read(), write(), and stat() run with default options. They are one-shot operations: read() loads the entire file into memory, and write() writes a complete payload in a single call.

Builder Pattern Operations

Methods like read_with() and write_with() return future builders that allow chaining options before execution.
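The chaining idea can be sketched without the library. In this std-only sketch, ReadBuilder, range, and run are hypothetical names; the real builders are futures that execute when awaited, whereas this sketch uses an explicit run call over an in-memory byte slice.

```rust
use std::ops::Range;

/// Hypothetical builder that collects options before a read executes.
pub struct ReadBuilder<'a> {
    object: &'a [u8],            // stands in for the stored object
    range: Option<Range<usize>>, // optional byte range to read
}

impl<'a> ReadBuilder<'a> {
    pub fn new(object: &'a [u8]) -> Self {
        ReadBuilder { object, range: None }
    }

    /// Each option method consumes the builder and returns it, so calls chain.
    pub fn range(mut self, range: Range<usize>) -> Self {
        self.range = Some(range);
        self
    }

    /// Executes the read with the collected options.
    /// Without a range, the whole object is returned.
    pub fn run(self) -> Result<Vec<u8>, String> {
        let range = self.range.unwrap_or(0..self.object.len());
        self.object
            .get(range.clone())
            .map(|bytes| bytes.to_vec())
            .ok_or_else(|| format!("range {:?} is out of bounds", range))
    }
}
```

With this shape, `ReadBuilder::new(data).range(0..5).run()` reads only the first five bytes, and omitting `.range(..)` falls back to the default of reading everything.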

Options-Based Operations

Methods like read_options(), write_options() accept structured option types, providing the same functionality with a more explicit API.

Streaming Operations

Methods like reader(), writer() return streaming interfaces for incremental I/O, supporting large files and real-time data processing.

Sources: core/src/types/operator/operator.rs:444-580, core/src/types/operator/operator.rs:829-985, core/src/types/operator/operator_futures.rs:93-415

Operation Options and Arguments

Options Types

High-level option types like ReadOptions and WriteOptions provide user-friendly configuration with validation and defaults. These include features like byte ranges, conditional headers, content metadata, and concurrency settings.

Raw Args Types

Lower-level operation argument types like OpRead and OpWrite represent the validated, normalized parameters passed to storage service implementations. The conversion handles capability checking and parameter transformation.

Capability Validation

The Capability type defines which features each storage service supports, ensuring options are validated before being passed to the underlying implementation.
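The path from user-facing options to validated raw arguments can be sketched as follows. Caps, WriteOpts, RawWriteArgs, and validate are hypothetical names chosen for this illustration, and the two capability flags are a small assumed subset; the sketch shows only the shape of the check, not OpenDAL's implementation.

```rust
/// Hypothetical subset of what a service declares it supports.
#[derive(Clone, Copy, Debug, Default)]
pub struct Caps {
    pub write_can_append: bool,
    pub write_with_content_type: bool,
}

/// Hypothetical user-facing options with defaults.
#[derive(Clone, Debug, Default)]
pub struct WriteOpts {
    pub append: bool,
    pub content_type: Option<String>,
}

/// Hypothetical validated arguments handed to the service implementation.
#[derive(Clone, Debug, PartialEq)]
pub struct RawWriteArgs {
    pub append: bool,
    pub content_type: Option<String>,
}

/// Rejects options the service cannot honor, then produces the raw arguments.
pub fn validate(opts: WriteOpts, caps: Caps) -> Result<RawWriteArgs, String> {
    if opts.append && !caps.write_can_append {
        return Err("append is not supported by this service".to_string());
    }
    if opts.content_type.is_some() && !caps.write_with_content_type {
        return Err("content_type is not supported by this service".to_string());
    }
    Ok(RawWriteArgs {
        append: opts.append,
        content_type: opts.content_type,
    })
}
```

Failing early here means a service implementation only ever sees arguments it has declared support for.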

Sources: core/src/types/options.rs:27-350, core/src/raw/ops.rs:309-434, core/src/raw/ops.rs:544-665, core/src/types/capability.rs:20-226

Low-Level I/O Module (oio)

The oio (OpenDAL I/O) module provides raw I/O abstractions that storage services implement. It handles the complex details of buffering, chunking, and concurrent operations.

Diagram: oio Module Architecture

Write Strategy Pattern

OpenDAL supports three write strategies based on storage service capabilities:

  • BlockWrite: For services supporting block-based uploads (Azure Blob Storage)
  • MultipartWrite: For services supporting multipart uploads (AWS S3)
  • AppendWrite: For services supporting append operations (HDFS)

Each strategy implements the oio::Write trait but handles chunking, uploading, and completion differently.
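To make the "same trait, different handling" point concrete, here is a synchronous, in-memory sketch of two strategies behind one trait. ChunkWrite, AppendSketch, MultipartSketch, and write_all are hypothetical names; the real oio::Write trait is asynchronous and talks to a remote service.

```rust
/// Hypothetical, simplified write trait shared by all strategies.
pub trait ChunkWrite {
    fn write(&mut self, chunk: &[u8]);
    /// Finalizes the object and returns its full content.
    fn close(self) -> Vec<u8>;
}

/// Append-style: every chunk is appended to the object immediately.
#[derive(Default)]
pub struct AppendSketch {
    object: Vec<u8>,
}

impl ChunkWrite for AppendSketch {
    fn write(&mut self, chunk: &[u8]) {
        self.object.extend_from_slice(chunk);
    }

    fn close(self) -> Vec<u8> {
        self.object
    }
}

/// Multipart-style: chunks are stored as numbered parts and only become
/// one object when `close` assembles them in part order.
#[derive(Default)]
pub struct MultipartSketch {
    parts: Vec<(usize, Vec<u8>)>,
}

impl ChunkWrite for MultipartSketch {
    fn write(&mut self, chunk: &[u8]) {
        let part_number = self.parts.len() + 1;
        self.parts.push((part_number, chunk.to_vec()));
    }

    fn close(mut self) -> Vec<u8> {
        self.parts.sort_by_key(|(number, _)| *number);
        self.parts.into_iter().flat_map(|(_, data)| data).collect()
    }
}

/// Callers drive any strategy through the same interface.
pub fn write_all<W: ChunkWrite>(mut writer: W, chunks: &[&[u8]]) -> Vec<u8> {
    for chunk in chunks {
        writer.write(chunk);
    }
    writer.close()
}
```

Both writers produce the same final bytes; what differs is when data becomes part of the object and what the completion step has to do.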

Concurrent Task Management

The ConcurrentTasks type manages parallel upload/download operations, handling retry logic, error recovery, and resource cleanup across multiple concurrent streams.

Sources: core/src/raw/oio/write/block_write.rs:29-94, core/src/raw/oio/write/multipart_write.rs:27-104, core/src/raw/oio/write/append_write.rs:23-52, core/src/raw/futures_util.rs:105-305

Write Operation Flow

Write Context Creation

WriteContext determines the appropriate write strategy based on service capabilities and user options. It creates the optimal oio::Writer implementation and configures chunking and concurrency parameters.
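A capability-driven choice of strategy might look like the sketch below. Everything in it is an assumption made for illustration: the ServiceCaps flags, the Strategy enum, the preference order, and the OneShot fallback are hypothetical and do not describe OpenDAL's actual selection logic.

```rust
/// Hypothetical write strategies; `OneShot` is an assumed fallback for
/// services that support none of the chunked modes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Strategy {
    Append,
    Multipart,
    Block,
    OneShot,
}

/// Hypothetical capability flags relevant to choosing a strategy.
#[derive(Clone, Copy, Debug, Default)]
pub struct ServiceCaps {
    pub can_append: bool,
    pub can_multipart: bool,
    pub can_block: bool,
}

/// Picks a strategy from the service capabilities and the user's request.
/// The preference order (multipart, then block, then one-shot) is an assumption.
pub fn choose_strategy(caps: ServiceCaps, append_requested: bool) -> Result<Strategy, String> {
    if append_requested {
        return if caps.can_append {
            Ok(Strategy::Append)
        } else {
            Err("append requested but not supported by this service".to_string())
        };
    }
    if caps.can_multipart {
        Ok(Strategy::Multipart)
    } else if caps.can_block {
        Ok(Strategy::Block)
    } else {
        Ok(Strategy::OneShot)
    }
}
```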

Buffering and Chunking

Writers buffer incoming data until it fills a chunk of the target size. The chunk size is determined by service limits (e.g., S3's 5 MiB minimum part size for multipart uploads) and user preferences.
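The buffering step can be sketched as a small accumulator that hands out full chunks as soon as enough data has arrived. ChunkBuffer is a hypothetical type for illustration and uses a fixed chunk size; it is not the buffering code used by OpenDAL's writers.

```rust
/// Hypothetical buffer that groups incoming writes into fixed-size chunks.
pub struct ChunkBuffer {
    chunk_size: usize,
    pending: Vec<u8>,
}

impl ChunkBuffer {
    pub fn new(chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk size must be positive");
        ChunkBuffer { chunk_size, pending: Vec::new() }
    }

    /// Adds data and returns every full chunk that is now ready to upload.
    pub fn push(&mut self, data: &[u8]) -> Vec<Vec<u8>> {
        self.pending.extend_from_slice(data);
        let mut ready = Vec::new();
        while self.pending.len() >= self.chunk_size {
            // Keep the first `chunk_size` bytes as a chunk; the rest stays pending.
            let rest = self.pending.split_off(self.chunk_size);
            ready.push(std::mem::replace(&mut self.pending, rest));
        }
        ready
    }

    /// Returns the final, possibly short, chunk when the writer closes.
    pub fn finish(self) -> Option<Vec<u8>> {
        if self.pending.is_empty() {
            None
        } else {
            Some(self.pending)
        }
    }
}
```

Only the last chunk may be shorter than the configured size, which matches services that enforce a minimum size on every part except the final one.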

Concurrent Upload Management

For large files, multiple chunks are uploaded concurrently using ConcurrentTasks. Failed uploads are automatically retried with exponential backoff.
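Exponential backoff itself is simple to sketch: the delay doubles after each failed attempt, up to a cap. The functions below are hypothetical and std-only; where and how OpenDAL actually schedules retries is not shown here. The sleep parameter is injected so the loop can be exercised without real waiting.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Runs `op` until it succeeds or `max_retries` retries have been used.
/// `sleep` is called with the backoff delay between attempts.
pub fn retry_with_backoff<T, E>(
    max_retries: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of retries: surface the last error to the caller.
            Err(err) if attempt >= max_retries => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

In production use, `sleep` would be `std::thread::sleep` or an async timer, and only errors classified as temporary would be retried.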

Completion and Cleanup

The close operation ensures all chunks are uploaded and calls the service's completion API (e.g., S3's CompleteMultipartUpload) to finalize the object.

Sources: core/src/types/write.rs, core/src/raw/oio/write/multipart_write.rs:205-306, core/src/raw/oio/write/block_write.rs:170-242

Read Operation Flow

Read Context and Caching

ReadContext configures read behavior including concurrent chunk fetching, gap merging, and range optimization. Readers implement intelligent caching to minimize redundant requests.

Range Request Optimization

When multiple small ranges are requested with small gaps between them, the reader merges them into larger requests to reduce API calls, discarding unwanted data locally.
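Gap merging can be sketched as a sort followed by a single pass. merge_ranges and its max_gap parameter are hypothetical names for this illustration; the threshold OpenDAL uses and how it is configured are not covered here.

```rust
use std::ops::Range;

/// Merges byte ranges whose start lies within `max_gap` bytes of the
/// previous range's end. Overlapping and adjacent ranges are merged too.
pub fn merge_ranges(mut ranges: Vec<Range<u64>>, max_gap: u64) -> Vec<Range<u64>> {
    ranges.retain(|r| r.start < r.end); // drop empty ranges
    ranges.sort_by_key(|r| r.start);

    let mut merged: Vec<Range<u64>> = Vec::new();
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            if r.start <= last.end.saturating_add(max_gap) {
                // Close enough: extend the previous request to cover this range.
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}
```

For instance, with a gap threshold of 4 bytes, the ranges 0..10 and 12..20 become one request for 0..20; the two bytes at 10..12 are fetched and then thrown away, trading a little bandwidth for one fewer API call.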

Concurrent Reading

Large range requests are split into multiple concurrent chunks, improving throughput for high-latency storage services.
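The split-and-fetch idea can be sketched with scoped threads over an in-memory byte slice. split_range and read_concurrently are hypothetical helpers; a real reader would issue asynchronous range requests to the storage service instead of spawning threads.

```rust
use std::ops::Range;

/// Splits `range` into consecutive pieces of at most `chunk` bytes.
pub fn split_range(range: Range<u64>, chunk: u64) -> Vec<Range<u64>> {
    assert!(chunk > 0, "chunk size must be positive");
    let mut parts = Vec::new();
    let mut start = range.start;
    while start < range.end {
        let end = start.saturating_add(chunk).min(range.end);
        parts.push(start..end);
        start = end;
    }
    parts
}

/// Fetches each piece on its own thread and reassembles them in order.
/// `data` stands in for the remote object; `range` must lie within it.
pub fn read_concurrently(data: &[u8], range: Range<u64>, chunk: u64) -> Vec<u8> {
    let parts = split_range(range, chunk);
    let chunks: Vec<Vec<u8>> = std::thread::scope(|scope| {
        // Spawn every fetch first so they run in parallel.
        let handles: Vec<_> = parts
            .iter()
            .map(|p| {
                let (a, b) = (p.start as usize, p.end as usize);
                scope.spawn(move || data[a..b].to_vec())
            })
            .collect();
        // Join in spawn order, which preserves byte order in the result.
        handles
            .into_iter()
            .map(|h| h.join().expect("fetch thread panicked"))
            .collect()
    });
    chunks.concat()
}
```

Because results are joined in the order the pieces were created, the output is correct even when individual fetches finish out of order.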

Sources: core/src/types/read.rs, core/src/types/operator/operator_futures.rs:491-521

Data Type Conversions and Integration

OpenDAL provides extensive conversion capabilities between its types and standard Rust ecosystem types.

Buffer Integration

| From Type | To Buffer | Notes |
| --- | --- | --- |
| Vec<u8> | Buffer::from() | Zero-copy when possible |
| &[u8] | Buffer::from() | Always copies |
| String | Buffer::from() | UTF-8 bytes |
| bytes::Bytes | Buffer::from() | Zero-copy wrapper |

Stream Integration

| Type | Stream Support | Notes |
| --- | --- | --- |
| Reader | impl Stream<Item=Result<Bytes>> | Async iteration |
| Writer | impl Sink<Bytes> | Streaming writes |
| Lister | impl Stream<Item=Result<Entry>> | Directory traversal |

Async Runtime Integration

Futures Ecosystem

OpenDAL types integrate seamlessly with the Rust async ecosystem, implementing standard traits like Stream, Sink, and AsyncWrite for interoperability with other async libraries.

Executor Abstraction

The Executor type abstracts over different async runtimes (Tokio, async-std, etc.), allowing OpenDAL to work in various execution environments while maintaining optimal performance.
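The abstraction can be pictured as a small trait that accepts a unit of work and decides where it runs. The sketch below is a std-only approximation: Execute, Task, InlineExecutor, ThreadExecutor, and run_all are hypothetical names, and tasks are plain closures here, whereas an async executor would accept futures.

```rust
use std::thread;

/// A unit of background work. An async executor would take a boxed
/// future instead; a closure keeps this sketch std-only.
pub type Task = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical executor interface: callers submit work without knowing
/// which runtime ends up running it.
pub trait Execute: Send + Sync {
    fn execute(&self, task: Task);
}

/// Runs every task immediately on the calling thread.
pub struct InlineExecutor;

impl Execute for InlineExecutor {
    fn execute(&self, task: Task) {
        task();
    }
}

/// Runs every task on a freshly spawned, detached OS thread.
pub struct ThreadExecutor;

impl Execute for ThreadExecutor {
    fn execute(&self, task: Task) {
        thread::spawn(task);
    }
}

/// Library code depends only on the trait, so the runtime can be swapped.
pub fn run_all(executor: &dyn Execute, jobs: Vec<Task>) {
    for job in jobs {
        executor.execute(job);
    }
}
```

Swapping `InlineExecutor` for `ThreadExecutor` changes where the work runs without touching `run_all`, which is the property an executor abstraction is meant to provide.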

Sources: core/src/types/read.rs, core/src/types/write.rs, core/src/types/execute/executor.rs:29-86, core/src/raw/futures_util.rs:60-145