Examples

This section contains example metadata files, for both the current and the deprecated metadata versions, that can be used with the ai4_metadata package. Each example is provided in both YAML and JSON formats.

Version 2 (CURRENT)

YAML:

metadata_version: 2.1.0
title: DEEP OC Massive Online Data Streams
summary: Massive Online Data Streams analysis
description: |
  This use case analyzes online data streams in order to generate alerts with time-bounded constrains and in real-time.  The main study is focused on building additional intelligent module using NN and DL techniques in co-function with underlying Intrusion Detection Systems (IDS) supervising traffic networks of compute centers.  Preserving old data for historical purposes, security analysts will be able to supervise generated alerts and to enhance cyber security [1, 2] for such centers when large IT infrastructures and devices products a huge amount of data streaming continuously and dynamically.
  The principle of the solution is proactive time-series prediction [5] adopting NNs as well as DL to build prediction models capable to predict next step(s) in near future based on given current and past steps.  The discrepancy between the prediction and the reality gives an indication of anomaly (i.e. anomaly detection).
  The challenge of the solution is it aims to scalable edge technologies [4] to support extensive data analysis and modelling as well as to improve the cyber-resilience by adopting an heuristic approach, that combines misuse detection in real-time with the building intelligence module using NN and DL.
  Current modelling approach using DL techniques [3]: LSTM (vanilla, stacked, bidirectional, seq2seq encoder/decoder), GRU, CNN, and MLP
  **References**
  [1]: Bhattacharyya, D.K. and Kalita, J.K., 2013. Network anomaly detection: A machine learning perspective. Chapman and Hall/CRC.
  [2]: Dua, S. and Du, X., 2016. Data mining and machine learning in cybersecurity. Auerbach Publications.
  [3]: Yann LeCun, Yoshua Bengio, and Geofrey Hinton. [Deep learning](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf). Nature, 521(7553):436–444, may 2015.
  [4]: Nguyen, G., Nguyen, B.M., Tran, D. and Hluchy, L., 2018. A heuristics approach to mine behavioural data logs in mobile malware detection system. Data & Knowledge Engineering, 115, pp.129-151.
  [5]: Tran, N., Nguyen, T., Nguyen, B.M. and Nguyen, G., 2018. A Multivariate Fuzzy Time Series Resource Forecast Model for Clouds using LSTM and Data Correlation Analysis. Procedia Computer Science, 126, pp.636-645.
libraries:
  - TensorFlow
  - Keras
tasks:
  - Anomaly Detection
  - Time Series
categories:
  - AI4 trainable
data-type:
  - Tabular
tags:
  - security
  - networks
  - cybersecurity
  - anomaly detection
  - time-series prediction
dates:
  created: "2019-02-19"
  updated: "2024-02-19"
links:
  source_code: "https://github.com/deephdc/DEEP-OC-mods"
  docker_image: "https://hub.docker.com/r/deephdc/deep-oc-mods"
resources:
  inference:
    gpu: 1
    memory: 16GB
    cpu: 4

JSON:

{
    "metadata_version": "2.1.0",
    "title": "DEEP OC Massive Online Data Streams",
    "summary": "Massive Online Data Streams analysis",
    "description": "This use case analyzes online data streams in order to generate alerts with time-bounded constrains and in real-time.  The main study is focused on building additional intelligent module using NN and DL techniques in co-function with underlying Intrusion Detection Systems (IDS) supervising traffic networks of compute centers.  Preserving old data for historical purposes, security analysts will be able to supervise generated alerts and to enhance cyber security [1, 2] for such centers when large IT infrastructures and devices products a huge amount of data streaming continuously and dynamically.\n The principle of the solution is proactive time-series prediction [5] adopting NNs as well as DL to build prediction models capable to predict next step(s) in near future based on given current and past steps.  The discrepancy between the prediction and the reality gives an indication of anomaly (i.e. anomaly detection).\n The challenge of the solution is it aims to scalable edge technologies [4] to support extensive data analysis and modelling as well as to improve the cyber-resilience by adopting an heuristic approach, that combines misuse detection in real-time with the building intelligence module using NN and DL.\n Current modelling approach using DL techniques [3]: LSTM (vanilla, stacked, bidirectional, seq2seq encoder/decoder), GRU, CNN, and MLP\n **References**\n [1]: Bhattacharyya, D.K. and Kalita, J.K., 2013. Network anomaly detection: A machine learning perspective. Chapman and Hall/CRC.\n [2]: Dua, S. and Du, X., 2016. Data mining and machine learning in cybersecurity. Auerbach Publications.\n [3]: Yann LeCun, Yoshua Bengio, and Geofrey Hinton. [Deep learning](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf). Nature, 521(7553):436–444, may 2015.\n [4]: Nguyen, G., Nguyen, B.M., Tran, D. and Hluchy, L., 2018. A heuristics approach to mine behavioural data logs in mobile malware detection system. Data & Knowledge Engineering, 115, pp.129-151.\n [5]: Tran, N., Nguyen, T., Nguyen, B.M. and Nguyen, G., 2018. A Multivariate Fuzzy Time Series Resource Forecast Model for Clouds using LSTM and Data Correlation Analysis. Procedia Computer Science, 126, pp.636-645.\n",
    "libraries": ["TensorFlow", "Keras"],
    "tasks": ["Anomaly Detection", "Time Series"],
    "categories": ["AI4 trainable"],
    "data-type": ["Tabular"],
    "tags": ["security", "networks", "cybersecurity", "anomaly detection", "time-series prediction"],
    "dates": {
        "created": "2019-02-19",
        "updated": "2024-02-19"
    },
    "links": {
        "source_code": "https://github.com/deephdc/DEEP-OC-mods",
        "docker_image": "https://hub.docker.com/r/deephdc/deep-oc-mods"
    },
    "resources": {
        "inference": {
            "gpu": 1,
            "memory": "16GB",
            "cpu": 4
        }
    }
}
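
As a minimal sketch of how a document like the one above might be consumed, the snippet below parses a trimmed copy of it with Python's standard json module and reads the nested resources.inference block. It does not use the ai4_metadata package; the helper name inference_resources is hypothetical, and treating a missing resources key as an empty mapping is an assumption made for illustration.

```python
import json

# Trimmed copy of the JSON example above; only a few fields are kept.
RAW = """
{
    "metadata_version": "2.1.0",
    "title": "DEEP OC Massive Online Data Streams",
    "libraries": ["TensorFlow", "Keras"],
    "resources": {
        "inference": {"gpu": 1, "memory": "16GB", "cpu": 4}
    }
}
"""


def inference_resources(metadata: dict) -> dict:
    """Return the resources.inference mapping, or {} if it is absent.

    Hypothetical helper for illustration; not part of ai4_metadata.
    """
    return metadata.get("resources", {}).get("inference", {})


metadata = json.loads(RAW)
resources = inference_resources(metadata)
print(metadata["title"])
print(resources)
```

Returning an empty mapping rather than raising keeps the lookup safe for documents that do not declare any resources.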

JSON, for a second module that uses metadata_version 2.0.0:

{
  "metadata_version": "2.0.0",
  "title": "Phytoplankton species classifier (VLIZ)",
  "summary": "Identify the species level of Plankton for 95 classes.",
  "description": "Phytoplankton species classifier is an application to classify phytoplankton, features DEEPaaS API.\nProvided by [VLIZ (Flanders Marine Institute)](https://www.vliz.be/nl).\n\nPlankton, though small, plays a critical role in marine ecosystems. Identifying plankton species is vital for understanding ecosystem health, predicting harmful algal blooms, and managing marine environments.\nThe FlowCam, a technology capturing high-resolution images of plankton, coupled with artificial intelligence (AI), has revolutionized plankton identification.\n\nFlowCam's ability to swiftly capture and analyze plankton images has transformed the once time-consuming process of identification.\nWhen integrated with AI, this technology can rapidly categorize and identify plankton species with remarkable accuracy, providing a comprehensive understanding of marine communities.\n\nThe benefits are numerous: real-time monitoring of marine environments, early detection of changes, support for conservation efforts, and contributing valuable data for research and policy decisions.\nAI-driven plankton identification is a game-changer, offering a powerful tool for researchers.\n\nThis Docker container contains a trained Convolutional Neural network optimized for phytoplankton identification using images. The architecture used is an Xception [1] network using Keras on top of Tensorflow.\n\nThe PREDICT method expects an RGB image as input (or the URL of an RGB image) and will return a JSON with the top 5 predictions.\nAs a training dataset, we have used a collection of images from [VLIZ](https://www.vliz.be/nl) which consists of 350K images from 95 classes of phytoplankton.\n\nThanks to this module, the user has a couple of options:\n1. Users can use the existing model to predict phytoplankton species if it's part of one of our classes (see Zenodo).\n2. Users can upload their own data (i.e., images and datasplit files) on Nextcloud and train their new CNN to predict new classes.\n3. Users can transform and augment their images to explore new type of models.\n\n<img class='fit', src='https://raw.githubusercontent.com/ai4os-hub/phyto-plankton-classification/main/references/plankton_img.png'/>\n\n**References**\n1. Yann LeCun, Yoshua Bengio, and Geofrey Hinton. [Deep learning](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf). Nature, 521(7553):436-444, May 2015.\n\nThis module is largely based on the [existing image classification module](https://github.com/ai4os-hub/ai4os-image-classification-tf) made by [Ignacio Heredia](https://github.com/IgnacioHeredia)",
  "dates": {
    "created": "2024-08-22",
    "updated": "2025-01-28"
  },
  "links": {
    "ai4_template": "ai4-template/2.1.4",
    "source_code": "https://github.com/ai4os-hub/phyto-plankton-classification",
    "docker_image": "ai4oshub/phyto-plankton-classification",
    "dataset": "https://zenodo.org/records/10554845",
    "documentation": "https://github.com/ai4os-hub/phyto-plankton-classification/blob/main/references/README_marketplace.md",
    "citation": "https://www.vliz.be/en",
    "cicd_url": "https://jenkins.services.ai4os.eu/job/ai4os-hub/job/phyto-plankton-classification/job/main/",
    "cicd_badge": "https://jenkins.services.ai4os.eu/buildStatus/icon?job=ai4os-hub/phyto-plankton-classification/main",
    "self": "https://api.cloud.ai4eosc.eu/v1/catalog/modules/phyto-plankton-classification/metadata"
  },
  "tags": [
    "deep learning",
    "vo.imagine-ai.eu"
  ],
  "tasks": [
    "Computer Vision",
    "Classification"
  ],
  "categories": [
    "AI4 pre trained",
    "AI4 trainable",
    "AI4 inference"
  ],
  "libraries": [
    "TensorFlow",
    "Keras"
  ],
  "data-type": [
    "Image"
  ],
  "license": "Apache-2.0",
  "id": "phyto-plankton-classification"
}
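
Both Version 2 examples carry a metadata_version field (2.1.0 and 2.0.0), while the Version 1 examples in this section have none. The sketch below uses that observation to guess the major version of an already parsed document. This is only a heuristic inferred from the examples on this page, not the detection logic of the ai4_metadata package, and the function name major_version is hypothetical.

```python
def major_version(metadata: dict) -> int:
    """Guess the major metadata version of a parsed document.

    Assumption (from the examples on this page only): documents without a
    "metadata_version" key are treated as Version 1.
    """
    version = metadata.get("metadata_version")
    if version is None:
        return 1
    # "2.1.0" -> "2" -> 2
    return int(str(version).split(".")[0])


print(major_version({"metadata_version": "2.1.0"}))
print(major_version({"title": "DEEP OC Massive Online Data Streams"}))
```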

Version 1 (DEPRECATED)

YAML:

---
title: DEEP OC Massive Online Data Streams
summary: Massive Online Data Streams analysis
description:
- This use case analyzes online data streams in order to generate alerts with time-bounded
  constrains and in real-time.
- The main study is focused on building additional intelligent module using NN and
  DL techniques
- in co-function with underlying Intrusion Detection Systems (IDS) supervising traffic
  networks of compute centers.
- Preserving old data for historical purposes, security analysts will be able to supervise
  generated alerts
- and to enhance cyber security [1, 2] for such centers when large IT infrastructures
  and devices
- products a huge amount of data streaming continuously and dynamically.\n
- The principle of the solution is proactive time-series prediction [5] adopting NNs
  as well as DL to build
- prediction models capable to predict next step(s) in near future based on given
  current and past steps.
- The discrepancy between the prediction and the reality gives an indication of anomaly
  (i.e. anomaly detection).\n
- ''
- The challenge of the solution is it aims to scalable edge technologies [4] to support
- extensive data analysis and modelling as well as to improve the cyber-resilience
  by adopting an heuristic approach,
- that combines misuse detection in real-time with the building intelligence module
  using NN and DL.\n
- ''
- 'Current modelling approach using DL techniques [3]:'
- LSTM (vanilla, stacked, bidirectional, seq2seq encoder/decoder), GRU, CNN, and MLP\n
- "**References**"
- "[1]: Bhattacharyya, D.K. and Kalita, J.K., 2013. Network anomaly detection: A machine
  learning perspective. Chapman and Hall/CRC."
- "[2]: Dua, S. and Du, X., 2016. Data mining and machine learning in cybersecurity.
  Auerbach Publications."
- "[3]: Yann LeCun, Yoshua Bengio, and Geofrey Hinton. [Deep learning](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf).
  Nature, 521(7553):436–444, may 2015."
- "[4]: Nguyen, G., Nguyen, B.M., Tran, D. and Hluchy, L., 2018. A heuristics approach
  to mine behavioural data logs in mobile malware detection system. Data & Knowledge
  Engineering, 115, pp.129-151."
- "[5]: Tran, N., Nguyen, T., Nguyen, B.M. and Nguyen, G., 2018. A Multivariate Fuzzy
  Time Series Resource Forecast Model for Clouds using LSTM and Data Correlation Analysis.
  Procedia Computer Science, 126, pp.636-645."
keywords:
- services
- docker
license: Apache 2.0
date_creation: '2019-02-19'
sources:
  dockerfile_repo: https://github.com/deephdc/DEEP-OC-mods
  docker_registry_repo: deephdc/deep-oc-mods
  code: https://github.com/deephdc/mods
tosca:
- title: Mesos (CPU)
  url: https://raw.githubusercontent.com/indigo-dc/tosca-templates/master/deep-oc/deep-oc-mods-mesos-cpu.yml
  inputs:
  - rclone_conf
  - rclone_url
  - rclone_vendor
  - rclone_user
  - rclone_pass

JSON:

{
    "title": "DEEP OC Massive Online Data Streams",
    "summary": "Massive Online Data Streams analysis",
    "description": [
        "This use case analyzes online data streams in order to generate alerts with time-bounded constrains and in real-time.",
        "The main study is focused on building additional intelligent module using NN and DL techniques",
        "in co-function with underlying Intrusion Detection Systems (IDS) supervising traffic networks of compute centers.",
        "Preserving old data for historical purposes, security analysts will be able to supervise generated alerts",
        "and to enhance cyber security [1, 2] for such centers when large IT infrastructures and devices",
        "products a huge amount of data streaming continuously and dynamically.\\n",
        "The principle of the solution is proactive time-series prediction [5] adopting NNs as well as DL to build",
        "prediction models capable to predict next step(s) in near future based on given current and past steps.",
        "The discrepancy between the prediction and the reality gives an indication of anomaly (i.e. anomaly detection).\\n",
        "",
        "The challenge of the solution is it aims to scalable edge technologies [4] to support",
        "extensive data analysis and modelling as well as to improve the cyber-resilience by adopting an heuristic approach,",
        "that combines misuse detection in real-time with the building intelligence module using NN and DL.\\n",
        "",
        "Current modelling approach using DL techniques [3]:",
        "LSTM (vanilla, stacked, bidirectional, seq2seq encoder/decoder), GRU, CNN, and MLP\\n",
        "**References**",
        "[1]: Bhattacharyya, D.K. and Kalita, J.K., 2013. Network anomaly detection: A machine learning perspective. Chapman and Hall/CRC.",
        "[2]: Dua, S. and Du, X., 2016. Data mining and machine learning in cybersecurity. Auerbach Publications.",
        "[3]: Yann LeCun, Yoshua Bengio, and Geofrey Hinton. [Deep learning](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf). Nature, 521(7553):436–444, may 2015.",
        "[4]: Nguyen, G., Nguyen, B.M., Tran, D. and Hluchy, L., 2018. A heuristics approach to mine behavioural data logs in mobile malware detection system. Data & Knowledge Engineering, 115, pp.129-151.",
        "[5]: Tran, N., Nguyen, T., Nguyen, B.M. and Nguyen, G., 2018. A Multivariate Fuzzy Time Series Resource Forecast Model for Clouds using LSTM and Data Correlation Analysis. Procedia Computer Science, 126, pp.636-645."
    ],
    "keywords": [
        "services",
        "docker"
    ],
    "license": "Apache 2.0",
    "date_creation": "2019-02-19",
    "sources": {
        "dockerfile_repo": "https://github.com/deephdc/DEEP-OC-mods",
        "docker_registry_repo": "deephdc/deep-oc-mods",
        "code": "https://github.com/deephdc/mods"
    },
    "tosca": [
        {
            "title": "Mesos (CPU)",
            "url": "https://raw.githubusercontent.com/indigo-dc/tosca-templates/master/deep-oc/deep-oc-mods-mesos-cpu.yml",
            "inputs": [
                "rclone_conf",
                "rclone_url",
                "rclone_vendor",
                "rclone_user",
                "rclone_pass"
            ]
        }
    ]
}
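
In the Version 1 examples the description is a list of text fragments, whereas the Version 2 examples store it as a single string. The sketch below shows one way code reading either layout might normalise the field to plain text. It is an illustration only: the helper name description_text is hypothetical, and joining fragments with a single space while skipping the empty separator entries is an assumption, not behaviour documented for the ai4_metadata package.

```python
# Two fragments copied from the Version 1 example, plus one empty separator entry.
v1_metadata = {
    "title": "DEEP OC Massive Online Data Streams",
    "description": [
        "The main study is focused on building additional intelligent module using NN and DL techniques",
        "in co-function with underlying Intrusion Detection Systems (IDS) supervising traffic networks of compute centers.",
        "",
    ],
    "keywords": ["services", "docker"],
}


def description_text(metadata: dict) -> str:
    """Return the description as one string for either layout.

    Hypothetical helper: list-style descriptions are joined with a single
    space and empty fragments are dropped; string descriptions pass through.
    """
    description = metadata.get("description", "")
    if isinstance(description, list):
        return " ".join(part for part in description if part)
    return description


print(description_text(v1_metadata))
```

Dropping the empty fragments loses the paragraph breaks they appear to mark; a caller that needs them could map empty entries to a newline instead.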