Learn more about LinuxCon + ContainerCon + CloudOpen China, happening June 19-20. 

Presentation language: each talk title is marked [C] for Chinese, [C,E] for Chinese with English slides, or [E] for English.
Tuesday, June 20 • 14:55 - 15:25
Challenge of HPC Data Center: When HPC Meets the ML/DL and Container [C] - Yong Feng, IBM Canada Ltd.

As AI adoption grows, HPC data centers face the challenge of developing and running ML/DL workloads on their systems in a container runtime environment.
Existing HPC job schedulers are rarely chosen to run the ML/DL stack because they lack good support for long-running services. Conversely, the popular container platforms used to run the ML/DL stack cannot meet the requirements of traditional HPC workloads, because they lack support for non-Docker workloads and for scheduling policies such as backfill and CPU binding.
This session introduces an HPC data center architecture based on Kubernetes, an HPC job scheduler, and TensorFlow. The architecture runs MPI jobs and the DL stack together, isolated by containers and sharing resources dynamically, and the session includes a demo. It also explains the technical issues encountered during development and how they were resolved by enhancing those open source components.


Speakers
Yong Feng

Senior Product Architect, IBM Canada Ltd.
Yong Feng is a Senior Product Architect at IBM Spectrum Computing Canada. He has more than 10 years of experience in resource scheduling and management across HPC, virtual machine management, analytics/big data platforms, and container clouds. Yong Feng is currently leading a...


Tuesday June 20, 2017 14:55 - 15:25
Room 309B