BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160101Z
LOCATION:Track 2
DTSTART;TZID=America/New_York:20201118T153000
DTEND;TZID=America/New_York:20201118T160000
UID:submissions.supercomputing.org_SC20_sess146_pap211@linklings.com
SUMMARY:Metis: Learning to Schedule Long-Running Applications in Shared Co
 ntainer Clusters at Scale
DESCRIPTION:Paper\n\nMetis: Learning to Schedule Long-Running Applications
  in Shared Container Clusters at Scale\n\nWang\, Weng\, Wang\, Chen\,
  Li\n\nOnline cloud services are deployed as long-running applications
  (LRAs) in containers. Scheduling LRA containers is known to be difficult
  as they often have sophisticated resource interferences and I/O
  dependencies. Existing schedulers rely on placement constraints and thus
  fall short in performance.\n\nIn this work\, we present Metis\, a
  general-purpose scheduler using deep reinforcement learning (RL)
  techniques. This eliminates manual specification of placement
  constraints and offers concrete quantitative scheduling criteria. As
  directly training an RL model does not scale\, we develop novel
  hierarchical learning techniques that decompose a complex container
  placement problem into a hierarchy of subproblems with significantly
  reduced state and action space. We have implemented Metis in Docker
  Swarm. An EC2 deployment with real applications shows that\, compared
  with state-of-the-art schedulers\, Metis improves request throughput by
  up to 61%\, optimizes various scheduling objectives\, and easily scales
  to a large cluster where 3k containers run on over 700 machines.\n\nTag:
  Cloud and Distributed Computing\, Containers\, Machine Learning\, Deep
  Learning and Artificial Intelligence\, Resource Management and
  Scheduling\n\nRegistration Category: Tech Program Reg Pass
END:VEVENT
END:VCALENDAR

