This page contains information about projects of the applied inductive learning course.

Planning

Tutorial October 2, 2014 B28, 2.93
QA October 16, 2014 13:30 GMT+2 B28, 2.93
Project October 26, 2014 23:59 GMT+2 Project 1 - Classification algorithms, sources
QA November 13, 2014 13:30 GMT+2 B28, 2.93
QA November 20, 2014 13:30 GMT+2 B28, 2.93
Project November 23, 2014 23:59 GMT+1 Project 2 - bias and variance analysis (deadline extended to Sunday, updated November 21, 9:45 GMT+1)
Project December 13, 2014 23:59 GMT+1 Project 3 - End of the challenge
Project December 16, 2014 23:59 GMT+1 Project 3 - Submission of the challenge report
Project December 18, 2014 14:00 GMT+1 Project 3 - Oral presentation of your challenge solution
Project December 19 Project 3 - post challenge debriefing

Report submission

Project 3

The third project is organized in the form of a challenge in which you will compete against each other. We will provide training data for a given supervised learning task (activity recognition), which you will use to train your model, and test data with unknown labels, which will be used to validate it. You can use any technique and software you want to build the best possible model from the training data. During the project, you will be allowed to submit your predictions on the test data, and an intermediate ranking of the teams will be provided according to their current best scores on half of the test data. At the close of the challenge, the final ranking will be established according to the best score of each team on the second half of the test data (you should thus be careful not to overfit the first half of the test data!).

To handle this challenge, we use Kaggle. You can access the competition at this address: https://inclass.kaggle.com/c/snapp.

To join, each group member first needs to create an account on Kaggle. Then, one group member needs to create a team (option "My Team" in the Dashboard on the left) and invite the other group members to join it. Submissions to the challenge should only be made through your team.

At the end of the challenge, we also ask you to write a report that describes the different steps of your approach and your main results. You should also send us your source code. The submission instructions are the same as for the other projects.

Project 3 - oral presentation

As announced, the oral presentations for the third project will take place next Thursday in room 2.93. Because there is an important departmental meeting at 15h30, we would like to start the presentations at 12h30 (you can bring food with you into the room). You are all expected to attend all presentations. If you cannot join at that time, please let us know as soon as possible. We will then schedule your talk later in the afternoon.

Each group presentation should last at most 8-10 minutes and will be followed by a couple of questions. The structure of your presentation is free. You are of course expected to explain the method you used for your final submission, but you can also talk about things that did not work, difficulties you met during the project, ideas for potential improvements that you could not implement for lack of time, etc.

We encourage you to prepare slides for your presentation. To avoid losing time between presentations, please try to send your slides (preferably in PDF) to Arnaud Joly before Thursday at 12h00. If that is not possible, bring a USB key with you.

More resources

FAQ

Project 1

Which Python version can I use for the project?
You can use Python 2.7, 3.3, or 3.4. To keep your code running smoothly under both Python 2 and Python 3, avoid features specific to one major version and add the following imports at the top of each file:
# Python 3 true division, so that 2 / 3 == 0.66...
from __future__ import division
# Python 3 style unicode string literals
from __future__ import unicode_literals
# Python 3 print function
from __future__ import print_function
How can I plot finer boundary plots?
To refine the mesh used for the plot, set the mesh_step_size argument of plot_boundary to a lower value.
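As a rough illustration of why a smaller step gives a finer plot, the sketch below counts the points of a regular grid for two step sizes. The make_grid helper and the value ranges are hypothetical and only serve the illustration; the actual plot_boundary function may build its mesh differently.

```python
def make_grid(x_min, x_max, y_min, y_max, mesh_step_size):
    """Return the (x, y) points of a regular grid with the given step.

    Hypothetical helper for illustration, not the course's plot_boundary.
    """
    n_x = int(round((x_max - x_min) / mesh_step_size)) + 1
    n_y = int(round((y_max - y_min) / mesh_step_size)) + 1
    xs = [x_min + i * mesh_step_size for i in range(n_x)]
    ys = [y_min + j * mesh_step_size for j in range(n_y)]
    return [(x, y) for y in ys for x in xs]


# A smaller step means more grid points, hence a smoother-looking boundary.
coarse = make_grid(-1.0, 1.0, -1.0, 1.0, mesh_step_size=0.5)
fine = make_grid(-1.0, 1.0, -1.0, 1.0, mesh_step_size=0.1)
print(len(coarse), len(fine))
```

Note that the number of points grows quadratically as the step shrinks, so very small values can make plotting slow.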
How can I call functions from data.py?
First, navigate in your terminal or IDE to the directory where data.py is saved. Then you can run the following in a Python interpreter:
>>> from data import make_cross
>>> X, y = make_cross(random_state=0)
Similarly, you can run the following script from that directory:
from data import make_cross

if __name__ == "__main__":
    X, y = make_cross(random_state=0)
On Linux and macOS, how can I know that I am using the right Python interpreter?
You can simply run the following command, which prints the path of the python executable found first on your PATH:
$ which python
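The same check can also be made from inside Python, which works on any platform. This is a minimal sketch using only the standard sys module:

```python
import sys

# Absolute path of the interpreter that is running this script.
print(sys.executable)
# Version of that interpreter, to check it is one of the supported ones.
print(sys.version_info[:3])
```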
How can I smooth the error curves?
You can average the scores or errors obtained over several different training and testing sets. In this project, make sure that you draw the appropriate number of points each time. Furthermore, set random_state appropriately so that the points of the curve are comparable.
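The averaging idea can be sketched as follows. Everything here is an assumption made for illustration: the toy 1-D data generator, the nearest-mean classifier, the sample sizes, and the seed offset used for the test set are not part of the assignment. The point is only that the same list of seeds is reused for every training size, so the averaged points are comparable.

```python
import random
from statistics import mean


def make_data(n_samples, random_state):
    """Draw a toy 1-D two-class sample (stand-in for the project data)."""
    rng = random.Random(random_state)
    y = [i % 2 for i in range(n_samples)]         # balanced labels 0, 1, 0, 1, ...
    X = [rng.gauss(2.0 * label, 1.0) for label in y]  # class means 0 and 2
    return X, y


def error_rate(n_train, n_test, random_state):
    """Train a nearest-mean classifier and return its test error rate."""
    X_train, y_train = make_data(n_train, random_state)
    # The test set depends only on the seed, not on the training size.
    X_test, y_test = make_data(n_test, random_state + 1000)
    mean_0 = mean(x for x, label in zip(X_train, y_train) if label == 0)
    mean_1 = mean(x for x, label in zip(X_train, y_train) if label == 1)
    errors = sum(
        int(abs(x - mean_1) < abs(x - mean_0)) != label
        for x, label in zip(X_test, y_test)
    )
    return errors / len(y_test)


train_sizes = [10, 20, 50, 100]
random_states = range(20)  # the same seeds are reused for every training size
averaged_curve = [
    mean(error_rate(n, 500, seed) for seed in random_states)
    for n in train_sizes
]
for n, err in zip(train_sizes, averaged_curve):
    print(n, round(err, 3))
```

Increasing the number of seeds gives a smoother curve at the cost of more computation.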
Which learning rate strategy should I use for the sgd algorithm?
As stated in the assignment, we will use a constant learning rate.
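To show what a constant learning rate means in practice, here is a minimal sketch of stochastic gradient descent on a squared-error loss for a 1-D linear model. The model, the loss, the toy data, and the hyperparameter values are assumptions chosen for illustration; they are not the algorithm specified in the assignment.

```python
import random


def sgd_constant_rate(X, y, learning_rate=0.05, n_epochs=50, random_state=0):
    """Fit y ~ w * x + b by SGD on the squared error with a constant step."""
    rng = random.Random(random_state)
    w, b = 0.0, 0.0
    indices = list(range(len(X)))
    for _ in range(n_epochs):
        rng.shuffle(indices)  # visit the samples in a new random order
        for i in indices:
            residual = (w * X[i] + b) - y[i]
            # Constant learning rate: the step size is never decayed.
            w -= learning_rate * residual * X[i]
            b -= learning_rate * residual
    return w, b


# Noise-free toy data generated from y = 3x + 1.
X = [i / 10.0 for i in range(-10, 11)]
y = [3.0 * x + 1.0 for x in X]
w, b = sgd_constant_rate(X, y)
print(round(w, 2), round(b, 2))
```

With a decaying schedule, learning_rate would instead be recomputed at each update; here it stays fixed for the whole run.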
What is counted in the 4-page report?
Everything from the first page to the last page of your PDF report. Appendices also count toward the page limit. The report must use A4 page size, a single-column format, and appropriate typography (e.g. a font size of 11-12pt with appropriate spacing). Figures should also be of appropriate size and design for a scientific report.

Project 2

Should I compute the bias or the squared bias?
The assignment has been updated to make it clear that you should compute the squared bias.
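For reference, the squared bias at a point is the squared difference between the true function value and the prediction averaged over many learning sets, while the variance measures how much the predictions fluctuate around that average. The sketch below estimates both on a toy problem. The target function, the constant model, the noise level, and the sample sizes are assumptions for illustration, not the setup of the assignment.

```python
import random
from statistics import mean


def true_function(x):
    """Toy noise-free target (an assumption for this sketch)."""
    return x * x


def fit_constant_model(n_samples, rng):
    """Train a constant predictor: the mean output of one noisy learning set."""
    outputs = []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        outputs.append(true_function(x) + rng.gauss(0.0, 0.1))
    return mean(outputs)


rng = random.Random(0)
x0 = 1.0  # point at which bias and variance are estimated
# One prediction at x0 per independently drawn learning set.
predictions = [fit_constant_model(30, rng) for _ in range(500)]

mean_prediction = mean(predictions)
squared_bias = (true_function(x0) - mean_prediction) ** 2
variance = mean((p - mean_prediction) ** 2 for p in predictions)
print(round(squared_bias, 3), round(variance, 4))
```

Unlike the signed bias, the squared bias is never negative, so it can be added directly to the variance and the noise term in the error decomposition.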