Develop a model from scratch
This tutorial explains how to develop an AI4OS module from scratch on your local machine.
You can also use the AI4OS Development environment from the AI4OS Dashboard if you want to develop your code in a ready-made environment based on a predefined Docker container (e.g. the official TensorFlow or PyTorch containers). This tutorial applies equally in that case: you only need to go to the Dashboard, select the AI4OS Development environment, and configure the Docker image and resources you want to use (see video demo).
If you are new to Machine Learning, you might want to check some useful Machine Learning resources we compiled to help you get started.
1. Setting the framework
This first step relies on the AI4OS Modules Template to create a template for your new module:
Then select the master branch of the template and answer the questions. Click Generate and you will be able to download a .zip file with two project directories (<project-name> and DEEP-OC-<project-name>).
Extract them locally.
Go to GitHub and create the corresponding repositories. Then run git push origin --all in both extracted directories to push your initial code to GitHub.
Install your project as a Python module in editable mode (so that the changes you make to the codebase are picked up by Python).
$ cd <project-name>
$ pip install -e .
Now you can start writing your code.
Some users have reported issues on some systems when installing deepaas (which is always present in the requirements.txt of your project). Those issues have been resolved as follows:
In PyTorch Docker images, by making sure gcc is installed (apt install gcc).
In other systems, python3-dev is sometimes needed (apt install python3-dev).
To be able to interface with DEEPaaS you have to define in
api.py the functions you want to make accessible to the user. For this tutorial we are going to head to our official demo module and copy-paste its
Once this is done, check that DEEPaaS is interfacing correctly by running:
$ deepaas-run --listen-ip 0.0.0.0
Your module should be visible at http://0.0.0.0:5000/ui . If you don't see your module, there is probably an error in the api.py file. Try running it with Python to get a more detailed debug message.
$ python api.py
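Beyond running the file directly, a small script can import api.py and report which entry points are missing. This is only a sketch: the function names in EXPECTED_FUNCTIONS are assumed from the DEEPaaS V2 API and may not match what your module needs, and missing_api_functions is a hypothetical helper, not part of DEEPaaS.

```python
import importlib.util
from pathlib import Path

# Entry point names assumed from the DEEPaaS V2 API; adjust to your module.
EXPECTED_FUNCTIONS = ("get_metadata", "get_predict_args", "predict")


def missing_api_functions(api_path, expected=EXPECTED_FUNCTIONS):
    """Import the file at api_path and return the expected names it lacks."""
    api_path = Path(api_path)
    spec = importlib.util.spec_from_file_location(api_path.stem, api_path)
    module = importlib.util.module_from_spec(spec)
    # Executing the module surfaces syntax and import errors as a traceback.
    spec.loader.exec_module(module)
    return [
        name
        for name in expected
        if not callable(getattr(module, name, None))
    ]
```

Calling missing_api_functions("api.py") from your project folder returns an empty list when all the assumed entry points are defined.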
Remember to leave the get_metadata() function that comes predefined with your module untouched, as all modules should have proper metadata.
You can also use port 6006 to expose a training monitoring tool, like TensorBoard.
To improve the readability of the code and the overall maintainability of the project, we enforce proper Python coding style (PEP 8) on all modules added to the Marketplace. Modules that fail the style tests won't be able to build Docker images. If you want to check whether your module passes the tests, go to your project folder and type:
There you should see a detailed report of the offending lines (if any). You can always turn off flake8 testing in some parts of the code if long lines are really needed.
If your project has many offending lines, we recommend using a code formatter such as Black. It also helps keep a consistent code style and minimize git diffs. Black-formatted code is compliant with flake8 as long as both tools are configured with compatible settings (for example, the same maximum line length).
Once installed, you can check how Black would have reformatted your code:
$ black <code-folder> --diff
You can always turn off Black formatting if you want to keep some sections of your code untouched.
If you are happy with the changes, you can make them permanent using:
$ black <code-folder>
Remember to have a backup before reformatting, just in case!
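The opt-out markers for both tools look like this in practice. In this minimal sketch, a trailing noqa comment (optionally with an error code) silences flake8 for one line, and a fmt: off / fmt: on pair keeps Black from reformatting the region between them. The constant names and the URL are made up for illustration.

```python
# fmt: off
# Hand-aligned columns: Black leaves this region exactly as written.
KERNEL = [
    [1,  0, -1],
    [2,  0, -2],
    [1,  0, -1],
]
# fmt: on

# The trailing marker tells flake8 to skip the line-length check (E501) here.
DOCS_URL = "https://example.org/a/deliberately/long/url/that/would/otherwise/exceed/the/limit"  # noqa: E501
```

Keeping these markers rare and scoped to a single line or block leaves the rest of the code under the style checks.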
Once you are fine with the state of the <project-name> folder, push the changes to GitHub.
The DEEP-OC-<project-name> repo is in charge of creating a single Docker image that integrates your application, along with deepaas and any other dependency.
You need to modify the following files according to your needs:
Check that the installation steps are correct. If your module needs additional Linux packages, add them to the Dockerfile. Check that your Dockerfile works correctly by building it locally and running it:
$ docker build --no-cache -t your_project .
$ docker run -ti -p 5000:5000 -p 6006:6006 -p 8888:8888 your_project
Your module should be visible at http://0.0.0.0:5000/ui . You can make a POST request to the predict method to check everything is working as intended.
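The POST request can also be made from a script. The sketch below rests on two assumptions that you should confirm in the /ui page of your running container: that the predict method lives under /v2/models/<model-name>/predict/ (the DEEPaaS V2 layout), and that your module accepts its arguments as query parameters. Both helper functions are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_predict_request(base_url, model_name, params=None):
    """Build, without sending, a POST request to the predict method."""
    # Path layout assumed from the DEEPaaS V2 API; confirm it in the /ui page.
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/predict/"
    if params:
        # Assumption: the module takes its arguments as query parameters.
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"accept": "application/json"},
    )


def call_predict(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, call_predict(build_predict_request("http://0.0.0.0:5000", "your_project")) would return the decoded response, provided the assumptions above hold for your module.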
This is the information that will be displayed in the Marketplace. Among the fields you might need to edit are:
title (mandatory): short title,
summary (mandatory): one-line summary of your module,
description (optional): extended description of your module, like a README,
keywords (mandatory): tags to make your module easier to find,
training_files_url (optional): the URL of your model weights and additional training information,
dataset_url (optional): the URL of the dataset,
cite_url (optional): the DOI URL of any related publication.
Most other fields are pre-filled via the AI4OS Modules Template and usually do not need to be modified. Check you didn’t mess up the JSON formatting by running:
$ pip install git+https://github.com/deephdc/schema4apps
$ deep-app-schema-validator metadata.json
Due to some issues with JSON parsing, avoid using : in the values you fill in.
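As a quick pre-check before running the schema validator, the rules above can be sketched in a few lines of Python. This does not replace deep-app-schema-validator. The list of mandatory fields comes from this section, while limiting the colon check to free-text fields is an assumption made here because URL fields necessarily contain a colon. Both function names are hypothetical.

```python
import json

MANDATORY_FIELDS = ("title", "summary", "keywords")
# Assumption: only free-text fields are checked, since URLs need a colon.
TEXT_FIELDS = ("title", "summary", "description", "keywords")


def check_metadata(metadata):
    """Return a list of human-readable problems found in a metadata dict."""
    problems = []
    for field in MANDATORY_FIELDS:
        if not metadata.get(field):
            problems.append(f"missing mandatory field: {field}")
    for field in TEXT_FIELDS:
        value = metadata.get(field, "")
        # A field may hold a single string or a list of strings.
        items = value if isinstance(value, list) else [value]
        if any(":" in str(item) for item in items):
            problems.append(f"colon found in field: {field}")
    return problems


def check_metadata_file(path="metadata.json"):
    """Load a metadata file and check it; json.load fails on broken JSON."""
    with open(path, encoding="utf-8") as handle:
        return check_metadata(json.load(handle))
```

An empty list from check_metadata_file() means none of these particular problems were found; the schema validator remains the reference check.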
Once you are fine with the state of DEEP-OC-<project-name>, push the changes to GitHub.
4. Integrating the module in the Marketplace
Once your repo is set, it’s time to make a PR to add your model to the marketplace!
For this you have to fork the code of the AI4OS catalog repo (deephdc/deep-oc) and add your Docker repo name at the end of the MODULES.yml file:
- module: https://github.com/deephdc/UC-<github-user>-DEEP-OC-<project-name>
You can do this directly online on GitHub or via the command line:
$ git clone https://github.com/[my-github-fork]
$ cd [my-github-fork]
$ echo '- module: https://github.com/deephdc/UC-<github-user>-DEEP-OC-<project-name>' >> MODULES.yml
$ git commit -a -m "adding new module to the catalogue"
$ git push
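The echo step above appends the line unconditionally, so running it twice would list the module twice. A minimal Python sketch of the same step that skips the append when the entry is already present is shown here. The helper name add_module_entry is hypothetical, and the entry format simply mirrors the line used above.

```python
from pathlib import Path


def add_module_entry(modules_file, repo_url):
    """Append a '- module: <url>' line unless it is already listed."""
    path = Path(modules_file)
    entry = f"- module: {repo_url}"
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False  # already listed, nothing to do
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + entry + "\n", encoding="utf-8")
    return True
```

The function returns True when it wrote the entry and False when the file already contained it, so it is safe to run more than once before committing.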
Once the changes are done, make a PR of your fork to the original repo and wait for approval. Check the GitHub Standard Fork & Pull Request Workflow in case of doubt.
When your module gets approved, you may need to commit and push a change to metadata.json in your https://github.com/<github-user>/DEEP-OC-<project-name> repository so that the pipeline runs for the first time and your module gets rendered in the Marketplace.
If you run into problems you can always check the Frequently Asked Questions (FAQ).