Students who have problems accessing MOC (e.g. cannot access even via VPN or connection is too slow) should contact me ASAP.
1. TASK I: OpenTracing tutorial (credits: 20/100)
The first task of this assignment is to familiarize yourself with Jaeger. OpenTracing has an excellent tutorial for beginners.
Hint #1: The tutorial suggests that you use Jaeger within a Docker container. To do so, you will first need to install Docker Desktop.
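Once Docker Desktop is installed, the Jaeger all-in-one image can be started roughly as follows (image name and ports are those documented by the Jaeger project; adjust to your setup):

```shell
# Start the Jaeger all-in-one container (collector, agent, and web UI).
# Port 16686 serves the web UI; 6831/udp receives spans from client libraries.
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 6831:6831/udp \
  jaegertracing/all-in-one:latest
```

The web UI should then be reachable at http://localhost:16686.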
Hint #2: You may also want to have a look at this tutorial, which discusses additional Jaeger features, such as RPC tracing and baggage (see also Lecture 18).

2. TASK II: Restructure your operator library (credits: 20/100)
The second task is to restructure your operator library from Assignments #1 and #2 so that it supports push-based data-parallel execution. To do so, you need to work on the Ray branch from Assignment #1 and implement the following changes:
A. Replace get_next() with the method execute(tuples: List[ATuple]) -> bool that applies the logic of the operator to a list of tuples (provided by the previous operator in the plan) and pushes the output batch of tuples to the next operator.
B. Modify the constructor of each relational operator so that it accepts one or more (data-parallel) instances of the same input logical operator.
C. Implement a Sink operator that simply stores the output of the query in memory. Sink should have a method get_result() -> List[ATuple] that returns the query output.
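Setting the Ray actor wrappers aside for a moment, the push-based interface described in A–C can be sketched in plain Python as follows (class and field names beyond those named above are illustrative, not prescribed):

```python
from typing import List


class ATuple:
    """Minimal tuple wrapper (illustrative)."""
    def __init__(self, tup):
        self.tuple = tup


class Select:
    def __init__(self, outputs, predicate):
        self.outputs = outputs      # next operator(s) in the plan
        self.predicate = predicate  # filtering predicate

    def execute(self, tuples: List[ATuple]) -> bool:
        # Apply the operator logic to the incoming batch and push
        # the surviving tuples to the next operator(s).
        batch = [t for t in tuples if self.predicate(t)]
        for op in self.outputs:
            op.execute(batch)
        return True


class Sink:
    def __init__(self):
        self.result: List[ATuple] = []

    def execute(self, tuples: List[ATuple]) -> bool:
        # Simply accumulate the query output in memory.
        self.result.extend(tuples)
        return True

    def get_result(self) -> List[ATuple]:
        return self.result
```

On the Ray branch, these classes would additionally be decorated with @ray.remote and execute() would be invoked via .remote(), as in the example below.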
In the end, you should be able to execute the recommendation query from Assignment #1 by calling the execute() method on each data-parallel instance of the Scan operator to get the recommended movie id. Assuming two instances, this will look as follows:
    # Your implementation of the recommendation query
    scan_1 = Scan.remote(…)
    scan_2 = Scan.remote(…)
    select_1 = Select.remote(…)
    …
    sink = Sink(…)
    …

    # Start query execution
    scan_1.execute.remote()
    scan_2.execute.remote()

    movie_id = ray.get(sink.get_result())  # Blocking call
Hint: The execute() method of the Scan operator should read lines from the input file within a loop and push the corresponding batches to the next operator in the plan until the file is exhausted.
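One way to realize this hint, assuming the push-based interface from Task II (the batch size and the line-per-tuple file format are placeholders):

```python
from typing import List


class Scan:
    def __init__(self, filepath: str, outputs, batch_size: int = 100):
        self.filepath = filepath
        self.outputs = outputs        # next operator(s) in the plan
        self.batch_size = batch_size  # tuples per pushed batch

    def execute(self) -> bool:
        # Read the input file in fixed-size batches and push each batch
        # downstream until the file is exhausted.
        batch: List[str] = []
        with open(self.filepath) as f:
            for line in f:
                batch.append(line.rstrip("\n"))
                if len(batch) == self.batch_size:
                    for op in self.outputs:
                        op.execute(batch)
                    batch = []
        if batch:  # flush the final, possibly partial, batch
            for op in self.outputs:
                op.execute(batch)
        return True
```

In your implementation the lines would of course be parsed into ATuple objects before being pushed.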
3. TASK III: Trace your operator library on Ray (credits: 40/100)
The third task is to instrument your operator library using Jaeger and generate traces for an execution of the recommendation query using one instance per operator. The result should be a typical span tree, like those you generated in Task I. Each span in the tree should represent a call of the execute() method on a relational operator. The span hierarchy (cf. Lecture 17) must reflect the sequence of execute() calls during query evaluation.
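To make the required span hierarchy concrete, here is a toy sketch of how execute() calls can be wrapped in spans. The ToyTracer below is a hand-rolled stand-in used purely for illustration; in your implementation you would obtain a real tracer (e.g. from jaeger_client) and propagate the parent span's context from operator to operator.

```python
class Span:
    def __init__(self, operation, parent=None):
        self.operation = operation
        self.parent = parent  # parent span in the tree (None for the root)


class ToyTracer:
    """Stand-in tracer that only records parent/child structure."""
    def __init__(self):
        self.spans = []

    def start_span(self, operation, child_of=None):
        span = Span(operation, parent=child_of)
        self.spans.append(span)
        return span


class Select:
    def __init__(self, outputs, predicate, tracer):
        self.outputs, self.predicate, self.tracer = outputs, predicate, tracer

    def execute(self, tuples, parent_span=None) -> bool:
        # Each execute() call becomes one span; the nesting of spans
        # mirrors the sequence of execute() calls during evaluation.
        span = self.tracer.start_span("Select.execute", child_of=parent_span)
        batch = [t for t in tuples if self.predicate(t)]
        for op in self.outputs:
            op.execute(batch, parent_span=span)
        return True
```

With a real Jaeger tracer, the resulting tree is exactly what you will explore in the web UI.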
Explore the span tree using Jaeger’s web-based UI and report the top-2 most time-consuming operators in your implementation. Push the related screenshot(s) to your Gitlab repository.
4. TASK IV: Distributed tracing on MOC (credits: 20/100)
The fourth and final task is to deploy your instrumented library on MOC and perform Task III in a distributed setting. For this task, you have additional flexibility to decide the particular cluster configuration, actor placement, type of VMs, etc.
A. As a first step, you need to create a small cluster of VMs on MOC. You are free to choose between Windows and Linux machines (we recommend the latter). Make sure you activate a Floating IP (“Networks → Floating IPs”) for at least one VM so that you can access the cluster from the outside world. To do so, you will also need to allow SSH connections (“Networks → Security Groups”) and upload your public SSH key (“Key Pairs”). In the end, you should be able to access your cluster via SSH using the Floating IP.
B. The next step is to install and deploy Ray on your cluster. See here for more information. The easiest way to check if Ray is deployed successfully is to search for Ray processes running on your cluster VMs. Before moving to the next step, make sure you can run a simple Ray application (e.g. calling a method on a remote actor).
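A minimal deployment, assuming Ray is already pip-installed on every VM, looks roughly like this (the port and head-node IP are placeholders you must adapt to your cluster):

```shell
# On the head node:
ray start --head --port=6379

# On each worker VM, pointing at the head node's private IP:
ray start --address='<head_ip>:6379'

# Quick sanity checks: cluster membership and running Ray processes.
ray status
pgrep -f raylet
```

If `ray status` lists all your VMs and each VM shows a raylet process, the deployment succeeded.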
C. The third step is to deploy your implementation for Task III on the Ray cluster and trace an execution of the recommendation query with the following parallelism: 8 Scan instances (4 per input file), 4 Select instances (one for each Scan instance reading from a partition of Friends.txt), 2 Join instances, 2 Group-by instances, 1 Order-by, 1 Project, and 1 Sink instance. Each instance should be a Ray actor and each actor can be placed on any VM. Make sure parts of your recommendation pipeline run on different VMs but do not spend time optimizing the actor placement. The goal of this task is to profile a placement, understand where time is spent, and identify bottlenecks.
D. Push the related Jaeger screenshot(s) to your Gitlab repository.
Hint: Make sure each VM has more vCPUs than the number of Ray actors you run on it. For example, if you plan to run three operators/actors on a VM, make sure that it has at least 4 vCPUs.