This assignment covers concepts of time series classification.
Students must submit TWO documents: 1) the .ipynb file and 2) either a .pdf or .html file created from the notebook.
Label each question and sub-question with a markdown cell. Answer conceptual questions in a markdown cell. Answers written as comments in code blocks will not be considered for credit.
One-word or one-sentence answers will not receive full credit.
Load the acsf1 dataset using sktime's built-in functions (Docs).
Create X_train, y_train and X_test, y_test
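The loading step might look like the sketch below. It assumes the `load_acsf1` loader in `sktime.datasets` with its `split` and `return_X_y` arguments; the wrapper name `load_acsf1_splits` is hypothetical, and the import sits inside the function so that the definition itself does not require sktime to be installed.

```python
def load_acsf1_splits():
    """Return (X_train, y_train, X_test, y_test) for the ACSF1 dataset.

    Hypothetical helper: assumes sktime is installed and that
    sktime.datasets.load_acsf1 accepts split= and return_X_y=.
    """
    from sktime.datasets import load_acsf1  # imported lazily on purpose

    X_train, y_train = load_acsf1(split="train", return_X_y=True)
    X_test, y_test = load_acsf1(split="test", return_X_y=True)
    return X_train, y_train, X_test, y_test


if __name__ == "__main__":
    X_train, y_train, X_test, y_test = load_acsf1_splits()
    print(X_train.shape, X_test.shape)
```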
Part 1: 35 pts Understanding the dataset
Locate the dataset description within the full UCR data repository (Link).
7 pts Give a verbal description of the dataset using information from the acsf1 detailed webpage, not the summary repository page.
7 pts There are 1460 time steps in each observation. Use len() to display this for any observation in X_train.
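A common stumbling block here is where `len()` is applied. The sketch below assumes X_train arrives in sktime's nested DataFrame layout (one column, each cell holding a whole series as a `pd.Series`); if your loader returns a 3D NumPy array instead, the indexing differs. The frame `X_demo` is a synthetic stand-in, not the real data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a nested frame: 3 rows, 1 column, each cell a series.
cells = np.empty(3, dtype=object)
for i in range(3):
    cells[i] = pd.Series(np.zeros(1460))
X_demo = pd.DataFrame({"dim_0": cells})

n_rows = len(X_demo)              # number of observations (rows)
n_columns = len(X_demo.iloc[0])   # number of columns in one row, not time steps
n_steps = len(X_demo.iloc[0, 0])  # length of the series stored in the cell

print(n_rows, n_columns, n_steps)
```

Only the last call reaches the series itself; the first two count rows and columns of the outer frame.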
7 pts Return the count of each class in y_train.
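One way to count classes, sketched with the standard library so it works on any iterable of labels. The helper name `class_counts` and the toy labels in `y_demo` are illustrative only.

```python
from collections import Counter


def class_counts(labels):
    """Return {label: count}, with labels converted to str and sorted."""
    counts = Counter(str(label) for label in labels)
    return dict(sorted(counts.items()))


# Toy labels standing in for y_train.
y_demo = ["3", "8", "9", "3", "8", "3"]
print(class_counts(y_demo))
```

If y_train is already in pandas, `pd.Series(y_train).value_counts()` gives the same information.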
7 pts Plot the first time series for each class (.iloc[0]) and label each plot with its class name. The patterns should match those shown on the acsf1 detailed webpage.
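A possible shape for the plotting step is sketched below. It assumes X is a nested DataFrame whose single column holds one `pd.Series` per row and that y is an array of class labels aligned with the rows of X. Both function names are hypothetical, and matplotlib is imported inside the plotting function so the index helper stays dependency-free.

```python
def first_index_per_class(labels):
    """Map each class label to the row position of its first occurrence."""
    first = {}
    for position, label in enumerate(labels):
        first.setdefault(label, position)
    return first


def plot_first_series_per_class(X, y):
    """Draw one subplot per class, titled with the class label."""
    import matplotlib.pyplot as plt  # lazy import: only needed for plotting

    first = first_index_per_class(y)
    classes = sorted(first)
    fig, axes = plt.subplots(
        len(classes), 1, figsize=(10, 2 * len(classes)), sharex=True, squeeze=False
    )
    for ax, label in zip(axes.ravel(), classes):
        series = X.iloc[first[label], 0]  # nested layout: the cell is a pd.Series
        ax.plot(series.to_numpy())
        ax.set_title(f"Class {label}")
    fig.tight_layout()
    return fig
```

Calling `plot_first_series_per_class(X_train, y_train)` in a notebook cell should render the figure. Selecting `X_train[y_train == label].iloc[0]` per class picks the same rows.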
7 pts Consecutive time steps are 10 seconds apart. Describe what the plots show for classes 3, 8 and 9. Give some intuition about what appliance each of these three classes might represent.
Part 2: 15 pts Description of Time Series Classification models
5 pts Select one classification model type. Describe how the model works. Why would it be a good or bad fit for this type of data?
5 pts Select a second classification model type. Describe how the model works. Why would it be a good or bad fit for this type of data?
5 pts Select a third classification model type. Describe how the model works. Why would it be a good or bad fit for this type of data?
Part 3: 50 pts Select one method. Model and examine results
10 pts Select only one method. Fit your model. Feel free to adjust parameters or try a grid search (optional).
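Whichever method you choose, the fit-then-score pattern is the same for any estimator exposing `.fit()` and `.score()`. The sketch below keeps that pattern generic; `fit_and_score` and `make_knn_dtw` are hypothetical helpers, and the 1-nearest-neighbour DTW classifier is only one example choice, assuming `KNeighborsTimeSeriesClassifier` from `sktime.classification.distance_based`.

```python
def fit_and_score(clf, X_train, y_train, X_test, y_test):
    """Fit clf on the training split and return train/test accuracy."""
    clf.fit(X_train, y_train)
    return {
        "train": clf.score(X_train, y_train),
        "test": clf.score(X_test, y_test),
    }


def make_knn_dtw():
    """Example estimator only; any sktime classifier could be substituted."""
    from sktime.classification.distance_based import KNeighborsTimeSeriesClassifier

    return KNeighborsTimeSeriesClassifier(n_neighbors=1, distance="dtw")
```

A call such as `fit_and_score(make_knn_dtw(), X_train, y_train, X_test, y_test)` would then return both accuracies in one dictionary.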
10 pts Return the accuracy score for the train set and the test set (suggestion: use .score()). Print the confusion matrix and classification report for the test set.
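The reporting calls can be sketched with scikit-learn's metrics module. The labels below are toy values, not ACSF1 results; in the notebook, `y_true` would be y_test and `y_pred` the output of your fitted model's `.predict(X_test)`.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Toy labels for illustration only.
y_true = ["0", "0", "1", "1", "2", "2"]
y_pred = ["0", "1", "1", "1", "2", "0"]
labels = ["0", "1", "2"]

acc = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, labels=labels)  # rows: true, columns: predicted
report = classification_report(y_true, y_pred, labels=labels)

print(f"accuracy: {acc:.3f}")
print(cm)
print(report)
```

Passing `labels` explicitly fixes the row and column order of the matrix, which makes it easier to line up with the per-class rows of the report.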
10 pts Discuss the precision score for class 8. Support this with your visual interpretation of the plots in 1D as well as the confusion matrix.
10 pts Discuss the recall score for class 5. Support this with your visual interpretation of the plots in 1D as well as the confusion matrix.
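To connect the two scores to the confusion matrix: with rows as true classes and columns as predicted classes (the scikit-learn convention, assumed here), precision for a class divides its diagonal entry by its column sum, and recall divides it by its row sum. The matrix below is a toy example and `precision_recall` is a hypothetical helper.

```python
def precision_recall(cm, k):
    """Return (precision, recall) for class index k of a square matrix.

    Assumes cm[i][j] counts samples of true class i predicted as class j.
    """
    true_positives = cm[k][k]
    predicted_as_k = sum(row[k] for row in cm)  # column sum
    actually_k = sum(cm[k])                     # row sum
    precision = true_positives / predicted_as_k if predicted_as_k else 0.0
    recall = true_positives / actually_k if actually_k else 0.0
    return precision, recall


# Toy 3-class matrix, not ACSF1 results.
cm_demo = [
    [1, 1, 0],
    [0, 2, 0],
    [1, 0, 1],
]
for k in range(len(cm_demo)):
    p, r = precision_recall(cm_demo, k)
    print(f"class {k}: precision={p:.2f} recall={r:.2f}")
```

Reading down a column shows which true classes were mistaken for that class (hurting precision); reading across a row shows where that class's samples ended up (hurting recall).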
10 pts Which metric do you feel is most important in the following business case? You work for ComEd, a local electricity supplier, and head a department that uses analytics to plan electrical supply for Chicago’s power grid. Assume that your department budgets for a certain amount of electrical supply at a fixed low rate. If total demand in Chicago stays within the purchased supply level, your department is performing. If demand breaches this supply level, the company is penalized and the rate for your supply is multiplied by 100, destroying your department’s performance. If you had to build your forecast model to classify patterns of high electrical usage (appliances, air conditioning, water heating) versus low electrical usage (lighting, TV, phone chargers), which metric (precision or recall) would you use?