Cloudflare R2 and MosaicML: Train LLMs on Any Compute with Zero Switching Costs

Together, Cloudflare and MosaicML give users the freedom to train LLMs on any compute, anywhere in the world, for faster, cheaper training runs without vendor lock-in.


Building generative AI models requires massive compute and data storage infrastructure. Training on huge datasets means that terabytes of data must be read in parallel by thousands of processes. In addition, model checkpoints need to be saved frequently throughout a training run, and these checkpoints alone can be hundreds of gigabytes in size.

In a recent blog post, Cloudflare and MosaicML engineers discuss how their tools work together to address these challenges. MosaicML’s open source StreamingDataset and Composer libraries let users stream training data from Cloudflare R2 and read and write model checkpoints there. Thanks to R2’s zero-egress pricing and MosaicML’s cloud-agnostic platform, users can start, stop, move, and resize jobs in response to GPU availability and prices across compute providers without paying data transfer fees. Because it charges no egress fees, R2 is a cost-effective storage complement to MosaicML training and gives users more autonomy and control over where their jobs run.
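As a rough illustration of this workflow, the sketch below points the S3-compatible clients behind StreamingDataset and Composer at an R2 bucket. It assumes R2 is reached through its S3-compatible endpoint and that both libraries honor the `S3_ENDPOINT_URL` environment variable. The account ID, bucket name, paths, checkpoint interval, and helper names (`r2_endpoint_url`, `configure_r2`, `checkpoint_kwargs`, `build_dataset`) are hypothetical placeholders, not part of either library.

```python
import os
from typing import Dict, MutableMapping, Optional


def r2_endpoint_url(account_id: str) -> str:
    """Return the S3-compatible endpoint for a Cloudflare R2 account."""
    return f"https://{account_id}.r2.cloudflarestorage.com"


def configure_r2(
    account_id: str,
    access_key_id: str,
    secret_access_key: str,
    env: Optional[MutableMapping[str, str]] = None,
) -> MutableMapping[str, str]:
    """Write R2 credentials into an environment mapping (os.environ by default).

    Assumption: the S3 clients used for streaming and checkpointing read
    S3_ENDPOINT_URL plus the standard AWS credential variables.
    """
    target = os.environ if env is None else env
    target["S3_ENDPOINT_URL"] = r2_endpoint_url(account_id)
    target["AWS_ACCESS_KEY_ID"] = access_key_id
    target["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    return target


def checkpoint_kwargs(bucket: str, run_name: str) -> Dict[str, str]:
    """Keyword arguments one might pass to Composer's Trainer for checkpointing.

    The "1000ba" interval (every 1000 batches) is an arbitrary example value.
    """
    return {
        "save_folder": f"s3://{bucket}/{run_name}/checkpoints",
        "save_interval": "1000ba",
    }


def build_dataset(bucket: str, local_cache: str, batch_size: int):
    """Create a StreamingDataset that reads shards from R2 into a local cache."""
    # Imported lazily so the helpers above work without the library installed.
    from streaming import StreamingDataset

    return StreamingDataset(
        remote=f"s3://{bucket}/train",  # hypothetical shard location in the bucket
        local=local_cache,
        shuffle=True,
        batch_size=batch_size,
    )
```

Because the endpoint and credentials live in environment variables rather than in the training code, the same script could in principle be launched on a different compute provider without changes to the data or checkpoint paths.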

June 22, 2023

Cloudflare R2 and MosaicML: Train LLMs on Any Compute with Zero Switching Costs

Related posts

MosaicML StreamingDataset: Fast, Accurate Streaming of Training Data from Cloud Storage (February 9, 2023)

Announcing MPT-7B-8K: 8K Context Length for Document Understanding (July 18, 2023)

MPT-30B: Raising the bar for open-source foundation models