AI/ML Storage Workloads in Google Cloud

  • Published 16 Jun 2024
  • Sean Derrington from Google Cloud's storage group presents advancements in cloud storage, particularly for AI and ML workloads. Google Cloud has focused on optimizing storage to meet the specific requirements of AI and ML applications, chiefly high throughput and low latency. Key innovations include Anywhere Cache, which caches data close to GPU and TPU resources to accelerate training, and the parallel file system, which is based on Intel DAOS and is designed to deliver ultra-low latency and high throughput. These advancements aim to provide flexible, scalable storage options that adapt to different workloads and performance needs.
    Derrington also highlights the introduction of Hyperdisk ML, a block storage offering that makes a volume accessible read-only across thousands of hosts, further speeding up data loading for training. In addition, Google Cloud has introduced Cloud Storage FUSE with caching, which lets customers mount a bucket as if it were a file system, reducing storage costs and improving training efficiency by eliminating the need for multiple copies of the data. These solutions are designed to shorten training epochs and so improve the overall efficiency of AI and ML workloads.
    Beyond AI and ML, Google Cloud has also focused on storage for other workloads, such as GKE and enterprise applications. Filestore offers several instance types (Basic, Zonal, and Regional), each catering to different performance, capacity, and availability needs. Filestore Multi-Share allows small persistent volumes to be provisioned and scales automatically as needed. Hyperdisk also introduces storage pools, which pool IOPS and capacity across multiple volumes to optimize resource usage and cost. These storage solutions are designed to support both stateless and stateful workloads, with high availability and seamless failover.
    Presented by Sean Derrington, Storage Group PM. Recorded live on the Google Cloud campus in Sunnyvale, California on June 13, 2024. Watch the entire presentation at techfieldday.com/appearance/g... or visit TechFieldDay.com/event/cfd20/ or g.co/cloud/fieldday2024 for more information.
  • Science & Technology
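
The description says Cloud Storage FUSE lets a bucket be mounted "as if it were a file system". As a rough illustration of what that means for training code, the sketch below reads a dataset with ordinary file I/O and reports simple totals. It is not taken from the presentation: the mount point `/mnt/training-data`, the `DATASET_DIR` environment variable, and the `read_dataset` helper are all hypothetical names chosen for this example, and the code assumes the bucket has already been mounted separately.

```python
import os
import time
from pathlib import Path


def read_dataset(mount_dir: str, chunk_size: int = 1 << 20) -> dict:
    """Read every regular file under mount_dir and return simple totals.

    Nothing here is specific to Cloud Storage: if mount_dir is a FUSE
    mount of a bucket, the same plain file calls read the objects.
    """
    files = 0
    total_bytes = 0
    start = time.perf_counter()
    for path in sorted(Path(mount_dir).rglob("*")):
        if not path.is_file():
            continue  # skip directories
        files += 1
        with open(path, "rb") as f:
            # Read in fixed-size chunks so large files are not held in memory.
            while chunk := f.read(chunk_size):
                total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return {"files": files, "bytes": total_bytes, "seconds": elapsed}


if __name__ == "__main__":
    # Hypothetical mount point; assumed to be mounted before this runs.
    mount = os.environ.get("DATASET_DIR", "/mnt/training-data")
    if os.path.isdir(mount):
        print(read_dataset(mount))
    else:
        print(f"{mount} is not a directory; set DATASET_DIR to a mounted path")
```

Running the function twice over the same directory and comparing the `seconds` values is one informal way to see whether a cache in front of the bucket is helping on repeated epochs, though the result will depend on the cache configuration and dataset size.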
