COMP3100 Assignment 2: Semester 1 – 2020
Taiji Cloud Scheduler
Instructions
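“Google owns and operates data centers all over the world, helping to keep the internet humming 24/7. Learn how our relentless focus on innovation has made our data centers some of the most high-performing, secure, reliable, and efficient data centers in the world.” [1]
Figure 1: Google data center [1].
“Google Cloud Scheduler is a fully managed enterprise-grade cron job scheduler. It allows you to schedule virtually any job, including batch, big data jobs, cloud infrastructure operations, and more. You can automate everything, including retries in case of failure, to reduce manual toil and intervention. Cloud Scheduler even acts as a single pane of glass, allowing you to manage all your automation tasks from one place.” [2]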
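Taiji Cloud is a start-up that wants to compete with Google in the data centre market. The most important thing for the company is to come up with a competitive job scheduling algorithm so that it can win market share from current unicorn companies such as Google.
The start-up's CTO knows that you have been implementing job scheduling algorithms in COMP3100 and asks you to explain in detail the three algorithms from Stage 2.
(Questions 1-3). The first three questions are based on Taiji Cloud's existing data centre, which has several resource types. The resources will be used by a number of jobs submitted to the data centre. The data centre follows three policies, called FirstFit, BestFit and WorstFit, to allocate jobs to resources. Information about the jobs (Table 1) and the resources (Table 2) is given below.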
Table 1: Jobs
Job ID  Arrival time  Est. execution time (time slots)  CPU cores needed  Memory needed  Disk space needed
0       2             5                                 3                 2              2
1       3             7                                 2                 3              5
2       5             4                                 4                 1              3
3       6             8                                 3                 2              4
4       10            9                                 1                 5              2
5       23            3                                 2                 3              3
6       34            10                                2                 3              2
7       43            12                                3                 2              2
8       44            3                                 4                 4              1
9       56            15                                3                 1              3
10      57            5                                 2                 2              4
11      89            1                                 1                 4              4
12      90            8                                 2                 3              3
13      94            7                                 2                 2              2
14      95            2                                 4                 3              5
15      95            6                                 4                 3              4
Table 2: Server resources
Resource type  Boot time  Time slot rate  CPU cores  Memory  Disk space  Limit
SYD            10         $1              2          10      15          3
MEL            15         $2              4          15      20          4
GLC            5          $3              8          20      25          5
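As a rough illustration of how the three policies differ, the sketch below applies the common textbook definitions of first, best and worst fit to the server types in Table 2. It is only a sketch under stated assumptions: the Stage 2 project description is the authoritative definition and may differ, for example in how booting or busy servers are treated. The fitness measure (spare CPU cores) and the names `SERVER_TYPES`, `fits` and `choose_type` are hypothetical, and the first row's label is taken to be `SYD`, which is also an assumption. Current server state and the per-type limit are ignored.

```python
# Server types from Table 2 in table order: (name, cores, memory, disk).
# The label "SYD" for the first row is an assumption.
SERVER_TYPES = [
    ("SYD", 2, 10, 15),
    ("MEL", 4, 15, 20),
    ("GLC", 8, 20, 25),
]


def fits(job, server):
    """Return True if the server's full capacity covers the job.

    `job` is a (cores, memory, disk) tuple taken from Table 1.
    """
    _, cores, memory, disk = server
    return job[0] <= cores and job[1] <= memory and job[2] <= disk


def choose_type(job, policy):
    """Pick a server type for `job` under a textbook fit policy.

    Fitness is the number of spare cores after placing the job.
    This measure is assumed, not taken from the Stage 2 specification.
    """
    candidates = [s for s in SERVER_TYPES if fits(job, s)]
    if not candidates:
        return None
    if policy == "first":
        return candidates[0][0]  # first type in table order that fits

    def spare(server):
        return server[1] - job[0]  # cores left over after placement

    if policy == "best":
        return min(candidates, key=spare)[0]  # tightest fit
    if policy == "worst":
        return max(candidates, key=spare)[0]  # loosest fit
    raise ValueError(f"unknown policy: {policy}")


job0 = (3, 2, 2)  # job 0 from Table 1: cores, memory, disk
print(choose_type(job0, "first"))  # MEL
print(choose_type(job0, "best"))   # MEL
print(choose_type(job0, "worst"))  # GLC
```

Job 0 needs three cores, so the two-core type is skipped and first fit settles on MEL, which agrees with the completed job 0 row in Table 4.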
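Assumptions:
• Each server is in exactly one of the following states: booting (BOT), idle (IDL) or active (ATV).
• If a server is running, take the server available time to be the time at which the server can run jobs, that is, when it is in the active state (ATV).
• Cost is calculated from the number of time slots used by jobs. If a server has no jobs at a given time, its cost is zero.
• When several jobs run on the same server, the cost is based on the time slots of the job with the largest estimated time.
• The data centre considers the following metrics to evaluate performance: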
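Table 3: Metrics
Metric           Description
Turnaround time  The time taken to complete a job. In other words, job turnaround time is the difference between completion time and arrival time.
Cost             The cost of the jobs submitted to each server, with respect to the time slot rate.
Utilization      How heavily the server is loaded in terms of CPU, memory and disk.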
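* Note that in this assignment, the current utilization or cost is considered for active servers. Each server type has its own cost.
** For this, state the main steps and briefly explain why.
For each metric you must consider the last completion time of all jobs, meaning the time at which no jobs remain to be executed (e.g. the simulation end time in ds-server). Each server's usage period therefore runs from the server ready time until the last completion time, including both active and idle time.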
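A minimal sketch of how the turnaround and cost definitions above might be computed. The helper names and the sample finish times are hypothetical and chosen only for illustration. The cost function reflects one possible reading of the assumption that jobs sharing a server are charged by the longest estimated job.

```python
def turnaround(arrival, completion):
    """Turnaround time = completion time - arrival time (Table 3)."""
    return completion - arrival


def average_turnaround(jobs):
    """Mean turnaround over a list of (arrival, completion) pairs."""
    return sum(turnaround(a, c) for a, c in jobs) / len(jobs)


def concurrent_cost(rate, estimated_times):
    """Cost of a group of jobs running together on one server.

    Assumed reading: the group is charged for the longest estimated job
    at the server type's rate per time slot. No jobs means zero cost.
    """
    if not estimated_times:
        return 0
    return rate * max(estimated_times)


# Hypothetical (arrival, completion) pairs, for illustration only.
sample = [(2, 22), (3, 10)]
print(average_turnaround(sample))  # 13.5
print(concurrent_cost(2, [5, 7]))  # 14
```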
The second row has already been filled in for job 0 under the FirstFit algorithm.
Table 4: Scheduling decisions
J_id  J_avt  S_type  S_id  S_state  S_ready      Why**        Res      J_finished  J_id_waiting  Cost*  Utilization*
0     2      MEL     0     BOT      2 + 15 = 17  Steps 3,4,5  4,15,20  N/A         N/A           $0     0%,0%,0%
The Table 4 parameters have the following meanings:
J_id: Job ID
J_avt: Job arrival time
S_type: Server type
S_id: Server ID
S_state: Server state
S_ready: Server available time
Why: The reason for the scheduling decision
Res: Current server resources (cores, memory, disk)
J_finished: Finished jobs on this server
J_id_waiting: Waiting jobs on this server
Cost: Accumulated server cost for each type over time
Utilization: Current utilization of resources
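The job 0 row of Table 4 can be reproduced with a few lines of arithmetic. The sketch assumes that a server booted on demand becomes ready at the job's arrival time plus the boot time, as the `2 + 15 = 17` entry suggests, and that the job then runs for exactly its estimated execution time. The variable names are hypothetical.

```python
arrival, estimated = 2, 5  # job 0 in Table 1
boot_time = 15             # MEL boot time in Table 2

# The server becomes usable once booting ends.
s_ready = arrival + boot_time
# Assumes the job starts as soon as the server is ready.
finish = s_ready + estimated
turnaround = finish - arrival

print(s_ready, finish, turnaround)  # 17 22 20
```

At arrival time the server is still booting, so it has not yet run anything, which is why the row shows a cost of $0 and 0% utilization.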
The CTO also asks you to write some investigative reports for the fourth question. Each part of that question includes instructions on what to write about as a starting point.
Reference:
[1] https://www.google.com.au/about/datacenters/
[2] https://cloud.google.com/scheduler
Exam questions are over the page.
Q1 FirstFit algorithm (20 marks)
Following the FirstFit algorithm described in the Stage 2 project description, show how the jobs in Table 1 are allocated to the resources in Table 2.
• You MUST use Table 4 as the template to calculate the average metrics in Table 3 once all jobs are completed, including:
Completion time, average CPU/MEM/DISK utilization per server type, cost per server type, and average turnaround time of jobs.
***Your understanding of the job allocation and metrics will be assessed.
Please focus on the logic behind each algorithm.***
Q2 BestFit algorithm (20 marks)
Following the BestFit algorithm described in the Stage 2 project description, show how the jobs in Table 1 are allocated to the resources in Table 2.
• You MUST use Table 4 as the template to calculate the average metrics in Table 3 once all jobs are completed, including:
Completion time, average CPU/MEM/DISK utilization per server type, cost per server type, and average turnaround time of jobs.
***Your understanding of the job allocation and metrics will be assessed.
Please focus on the logic behind each algorithm.***
Q3 WorstFit algorithm (20 marks)
Following the WorstFit algorithm described in the Stage 2 project description, show how the jobs in Table 1 are allocated to the resources in Table 2.
• You MUST use Table 4 as the template to calculate the average metrics in Table 3 once all jobs are completed, including:
Completion time, average CPU/MEM/DISK utilization per server type, cost per server type, and average turnaround time of jobs.
***Your understanding of the job allocation and metrics will be assessed.
Please focus on the logic behind each algorithm.***
Q4 Job Scheduler, Time Synchronisation, Fault Tolerance and Transparency (40 marks)
How is distribution transparency related to a job scheduler application such as this?
Summarise here what distribution transparency is, and give an example for each transparency principle based on the prototype job scheduler application.
What time synchronisation mechanism will you recommend?
Summarise why time synchronisation is required for this job scheduler application.
State your recommended time synchronisation mechanism and, more importantly, justify it.
What fault-tolerance mechanism will you recommend?
Summarise the specific fault model you are targeting for this prototype system, list the different fault-handling mechanisms, and state your recommended fault-handling mechanism. More importantly, justify your recommendation.