User documentation
Useful links
- A high-level overview of the project.
- The main source of knowledge on how to use the project. Always refer to it in case of doubt.
- The authentication management for accessing the AI4OS stack.
- Where users will typically search for modules developed by the community and find the relevant pointers to use them. It allows authenticated users to deploy virtual machines on specific hardware (e.g. GPUs) to train a module.
- The service that lets you store your data remotely and access it from inside your deployment. (old instance)
- The code of the software powering the platform.
- The code of all the modules available in the platform.
- Where the Docker images of the modules are stored.
- Custom Docker image registry we deployed to overcome Docker Hub limitations.
- Continuous Integration and Continuous Development Jenkins instance that keeps everything up to date with the latest code changes.
- Check whether a specific AI4OS service is down.
- Create new modules based on our project's template.
- Log your training parameters and models with our MLflow server.
- Scalable serverless inference of AI models.
- Compose custom AI workflows.
New to the project? How about a quick dive?
Overview
More in-depth documentation, with a detailed description of the architecture and components, is provided in the following sections.
How-to's
Use a model (basic user)
Train a model (intermediate user)
- Train a model locally
- Train a model with the Dashboard
  - 1. Choose a module from the Marketplace
  - 2. Upload your files to Nextcloud
  - 3. Deploy with the Training Dashboard
  - 4. Go to JupyterLab and mount your dataset
  - 5. Open the DEEPaaS API and train the model
  - 6. Test and export the newly trained model
  - 7. Create a Docker repo for your new module
  - 8. Share your new module in the Marketplace
  - 9. [optional] Add your new module to the original Continuous Integration pipeline
  - 10. Next steps
- Use rclone to sync your dataset
- Use MLflow to track your trainings
Develop a model (advanced user)
Use a tool (intermediate user)
We have specific guides for each of the tools.
Others
- Frequently Asked Questions (FAQ)
  - 🔥 Service X is not working
  - 🔥 The Dashboard says I only have 500 MB of disk in my deployment
  - 🔥 I ran out of disk in my deployment
  - 🔥 I cannot access /storage
  - 🔥 rclone fails to connect
  - 🔥 My deployment does not correctly list my resources
  - 🔥 My GPU just disappeared from my deployment
  - 🔥 I delete my deployment but it keeps reappearing
  - 🚀 I would like to suggest a new feature
- Useful Machine Learning resources
- Video demos