Learn more about LinuxCon + ContainerCon + CloudOpen China, happening June 19-20. 

Presentation language is marked at the end of each talk title: [C] Chinese, [C,E] Chinese with English slides, or [E] English.
Tuesday, June 20 • 11:40 - 12:10
Container-based Machine Learning Platform at Scale [C] - Kai Zhang, Alibaba Cloud

Running machine learning at large scale is frustrating. It requires massive amounts of data and computational power. Models are often built with frameworks such as TensorFlow, Caffe, and SparkML, each of which takes considerable effort to manage libraries and dependencies. The path from preparing data and training a model to online prediction is commonly a long one. At Aliyun, Kai's team has built an elastic machine learning platform based on Docker to address these difficulties. It provides services that support environment portability and repeatability, CPU/GPU isolation, job scheduling, monitoring, load balancing, and auto scaling.
In this talk, Kai introduces how the team supports the end-to-end lifecycle of large-scale machine learning workloads with GPU acceleration through container management and orchestration. A demo will show how users can start training their first neural network model in minutes.

Kai Zhang

Staff Engineer, Alibaba
Kai Zhang is a staff engineer at Alibaba Cloud. He has worked on container service products and enterprise solution development for 3 years. Before that, he worked on deep learning platforms, cloud computing, distributed systems, and SOA for over 10 years. Recently, he is exploring...

Tuesday June 20, 2017 11:40 - 12:10 HKT
Room 309B
Cloud Native & Containers, Developer