This assignment asks you to survey an area of machine learning that you want to explore.

 

Statistical Machine Learning STAT613 Spring 2020

Guidelines for Final Project

  1. You can do the report in teams of up to two people. Each member must submit the same report through Canvas.
  2. The final report must be 8 to 10 pages long (1.5 line spacing, one column, margins of 1 inch or less on each side).
  3. The front page must have the title of the project and the name of the student(s) (the front page does not count toward the 8-10 pages).
  4. The report must contain two parts: a survey and an experimental (alternatively theoretical) section. Each counts 50% toward the final grade of the project.
  5. The first part of the project is a survey (in your own words and with your own organization of topics) of one area of machine learning that you want to explore (3-4 pages). All the text here must be your own; you are not allowed to copy and paste from any other source. This is an important component of your project, so do not omit it.
  6. The second part is the development of an idea of your own in that same area (4-5 pages). Here you must propose an empirical or theoretical analysis, or a study, to validate that idea. For example:
  • You could compare the execution time of several machine learning techniques on various datasets.
  • You could implement a program (in any language you want) that tries to solve a particular problem of your interest (e.g., a module to avoid overfitting, a new idea to improve on a learning algorithm, etc.).
  • You could try a support vector machine, a neural network, a decision tree, or any other learning algorithm on a particular area of interest to you.
  • You could show a new learning bound on generalization performance that is tighter than previous bounds.
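As an illustration of the first option above (comparing execution time across techniques and datasets), here is a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for illustration: the synthetic datasets, the two deliberately simple classifiers, and the helper names `make_dataset`, `benchmark`, and `run_comparison`. A real project would substitute its own data and algorithms.

```python
import random
import time


def make_dataset(n_samples, n_features, seed):
    """Build a synthetic two-class dataset (two Gaussian blobs)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so both classes are balanced
        X.append([rng.gauss(2.0 * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


class NearestCentroid:
    """Predict the class whose training mean is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
            for x in X
        ]


class OneNearestNeighbor:
    """Predict the label of the single closest training point."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for x in X:
            best = min(range(len(self.X)), key=lambda i: sq_dist(x, self.X[i]))
            preds.append(self.y[best])
        return preds


def benchmark(model, X_train, y_train, X_test, y_test):
    """Return (seconds for fit + predict, test accuracy)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    elapsed = time.perf_counter() - start
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    return elapsed, accuracy


def run_comparison():
    """Time every model on every dataset; return a list of result rows."""
    datasets = {"small (200 x 2)": (200, 2), "larger (600 x 5)": (600, 5)}
    models = {"nearest centroid": NearestCentroid, "1-NN": OneNearestNeighbor}
    results = []
    for data_name, (n, d) in datasets.items():
        X, y = make_dataset(n, d, seed=0)
        half = n // 2  # first half for training, second half for testing
        for model_name, model_cls in models.items():
            seconds, acc = benchmark(
                model_cls(), X[:half], y[:half], X[half:], y[half:]
            )
            results.append((data_name, model_name, seconds, acc))
    return results


if __name__ == "__main__":
    for data_name, model_name, seconds, acc in run_comparison():
        print(f"{data_name:18s} {model_name:18s} {seconds:8.4f} s  acc={acc:.3f}")
```

Timings from a single run are noisy, so a report would normally repeat each measurement several times and state the mean and spread along with the hardware used.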

To gain extra credit on this second part of the report, you must make sure of the following:

A) You have to go beyond the simple use of an existing toolbox (e.g., WEKA, TensorFlow, MATLAB, RStudio, etc.). There must be some code development on your part. For example, you may want to modify an existing algorithm to try to achieve better performance, or you may need to develop code to process data from a specific domain.

Clarification: You are more than welcome to use any existing machine learning toolbox. That is perfectly fine. Just make sure you go “beyond” just clicking buttons on such toolboxes to get results. Some code development is needed at some point in your work.

B) There has to be some novelty in your approach. Perhaps you decide to tweak a neural network and try a new activation function, or you decide to implement a search technique using genetic algorithms that, at some point, switches to particle-swarm optimization to be more complete and exhaustive. Make sure your experimental design includes such a novel component.
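To make the activation-function example concrete, the sketch below shows one way to define an activation together with its derivative, check the derivative numerically, and plug the activation into a single-neuron forward pass. Swish, x * sigmoid(x), is used only as a stand-in because it is a known function; it is not itself novel, and a project would put its own candidate in its place. The helper names `gradient_check` and `dense_forward` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x):
    """Standard baseline activation to compare against."""
    return max(0.0, x)


def swish(x):
    """Stand-in for the candidate activation: x * sigmoid(x)."""
    return x * sigmoid(x)


def swish_grad(x):
    """Analytic derivative of swish, needed for backpropagation."""
    s = sigmoid(x)
    return s + x * s * (1.0 - s)


def gradient_check(f, f_grad, points, h=1e-5):
    """Largest gap between the analytic and central-difference derivative."""
    worst = 0.0
    for x in points:
        numeric = (f(x + h) - f(x - h)) / (2.0 * h)
        worst = max(worst, abs(numeric - f_grad(x)))
    return worst


def dense_forward(inputs, weights, bias, activation):
    """Single neuron: activation(w . x + b), with the activation pluggable."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(pre_activation)


if __name__ == "__main__":
    points = [-3.0, -1.0, 0.0, 0.5, 2.0]
    print("max derivative error:", gradient_check(swish, swish_grad, points))
    for name, act in (("relu", relu), ("swish", swish)):
        print(name, dense_forward([1.0, 2.0], [0.5, -0.25], 0.1, act))
```

Keeping the activation as a function argument means the same network code can be run with the baseline and with the new activation, which is what a fair comparison requires.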

The last part of the project (1 page max.) should contain conclusions and references. Make sure the conclusions are meaningful and clear, and list at least five references.