Nonlinear Data Analytics

Final Project


Description:

This is by far the largest component of the course. You will discover, explore, and attack a real-world problem of your choosing. There are three types of projects you can work on, listed below in order of increasing difficulty:

  • (1) Application of an existing algorithm to a new problem and potentially new data.
  • (2) Algorithmic work. Extend an existing algorithm or conceive a new one to solve some problem. This inherently includes the first option because you will need to test the new algorithm on data.
  • (3) Theoretical work. For example, derive a new convergence bound for a learning algorithm, or show that in some limit one learning algorithm becomes another.

These options also carry increasing risk. For example, you cannot turn in a paper saying you worked on a convergence bound for months with no results. Option two has medium risk because part of the process of creating a new algorithm is creating baselines to improve upon. At any time during the course, please feel free to come and discuss your problem with the instructor or TA and ask questions.

Requirements:

All of the requirements below must be satisfied in order to receive full credit for the project:

  • Partner:
    Maximum of 1 partner (we may allow 2 partners in extreme scenarios, e.g. a huge coding project). All partners must contribute equally.
  • Dataset:
    You must use at least one dataset with at least half a million (500,000) data points as a significant part of your project.
  • Format:
    Your submission must be a PDF in NIPS format. Note that this means you must use LaTeX with their style file. (NIPS, Neural Information Processing Systems, is one of the major machine learning conferences.)
    If you do not know how to use LaTeX, we recommend finding a partner who does.
  • Code Style:
    All code used in the production of your final report should be clean (suggested format) and placed in a public GitHub repository under the account of one of the group members. Include a footnote with the repository URL somewhere in your final PDF. Placing your code under an open-source license such as MIT is recommended but not required.
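
    As a quick sanity check on the Dataset requirement, the sketch below counts the records in a CSV file and compares the count against the half-million threshold. It assumes that one CSV row corresponds to one data point and that the first row is a header; both are assumptions for illustration, and the function names are hypothetical, so adapt them to your own data format.

```python
import csv
import io

# Threshold taken from the Dataset requirement: half a million data points.
MIN_DATA_POINTS = 500_000


def count_data_points(lines, has_header=True):
    """Count non-empty CSV records in an iterable of lines.

    Assumption: one CSV row is one data point.
    """
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    return sum(1 for row in reader if row)


def meets_size_requirement(n_points, minimum=MIN_DATA_POINTS):
    """Return True when the dataset is at least as large as the minimum."""
    return n_points >= minimum


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real file; for a file on disk, pass
    # open(path, newline="") instead of the StringIO object.
    sample = io.StringIO("x,y\n1,2\n3,4\n")
    n = count_data_points(sample)
    print(n, meets_size_requirement(n))
```

    Reading the file through an iterator keeps memory use flat, which matters for datasets of this size.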

Due Dates:


  • May 27: Project Proposal
    Select one particular direction that interests your team and that you can investigate further toward the final project (you may use the paper provided under the "Syllabus" tab). Candidate directions include existing data analytics and ML methods for analyzing UAV, cell phone, and other robotic data. Submit a typed (LaTeX) proposal of at most one page explaining your problem, which data sets you are likely to use (you must find some candidates), who your partners are, and which methods (of those you know of) you think you might use. Note that this is not 100% final, but it should be within some epsilon of your final project.

  • June 7:  Midterm Progress Report
    Submit a typed (LaTeX) progress report detailing your progress towards your goal. It should summarize your literature search and specify which data sets you are using and which methods you are applying. The write-up should be 3 to 5 pages for a 1-person group, 6 to 8 pages for a 2-person group, and 8 to 10 pages for a 3-person group.

  • June 26: Draft of Final Project Submission
    Submit a typed (LaTeX) draft of the final report together with all code written so far. The draft needs to detail the progress of the final project, which is expected to be substantial. It is used to demonstrate what you have done so far and to show that you are ready for the final presentation. It does not need to follow the NIPS format (which is required for the final version). The code does not need to be fully clean and organized for this draft submission, but it is expected to be cleaned up for the final submission. You do not need to have the presentation slides ready for this submission.

  • June 28: Final Project Presentation
    The presentation should be as detailed as possible and should run between 10 minutes and half an hour. Plan for a 10-12 minute presentation (plus 3 minutes for questions).

  • July 1 (tentatively): Final Project Submission
    Submission of the final project should be done electronically. It must include:
    • (1) The dataset;
    • (2) All code written for the project;
    • (3) The LaTeX final report following NIPS format (6 to 8 pages for a 1-person group, 12 to 15 pages for a 2-person group, and 15 to 20 pages for a 3-person group);
    • (4) All .tex files, with figures, references, etc., that generate the .pdf files for the midterm report, presentation slides, and final report;
    • (5) Any other files used.
    Only one copy of each item needs to be turned in per group. The submission must conform to the requirements above. If the dataset is too large to upload to GitHub, please contact the instructor or TA to arrange submission of the dataset.