diff --git a/README.md b/README.md index ab2085c4..79e18a26 100644 --- a/README.md +++ b/README.md @@ -9,11 +9,11 @@ - [Deploying Models to Production](#deploying-models-to-production) - [Visualization, Monitoring, and Logging](#visualization-monitoring-and-logging) - [End-to-End Use-Case Applications](#end-to-end-use-case-applications) - - [Smart Stock Trading](demos/stocks/read_stocks.ipynb) + - [Smart Stock Trading](demos/stocks/read-stocks.ipynb) - [Predictive Infrastructure Monitoring](demos/netops/generator.ipynb) - - [Image Recognition](demos/image_classification/keras-cnn-dog-or-cat-classification.ipynb) + - [Image Recognition](demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb) - [Natural Language Processing (NLP)](demos/nlp/nlp-example.ipynb) - - [Streaming Enrichment](demos/streaming-enrichment/Streaming-enrichment.ipynb) + - [Stream Enrichment](demos/stream-enrich/stream-enrich.ipynb) - [Jupyter Notebook Basics](#jupyter-notebook-basics) - [Creating Virtual Environments in Jupyter Notebook](#creating-virtual-environments-in-jupyter-notebook) - [Additional Resources](#additional-resources) @@ -53,7 +53,7 @@ For a more in-depth introduction to the platform, see the following resources: A good place to start your development is with the platform [tutorial Jupyter notebooks](https://github.com/v3io/tutorials). -- The [**GettingStarted**](GettingStarted/collect-n-explore.ipynb) directory contains information and code examples to help you quickly get started using the platform. +- The [**getting-started**](getting-started/collect-n-explore.ipynb) directory contains information and code examples to help you quickly get started using the platform. - The [**demos**](demos/README.ipynb) directory contains full end-to-end use-case application demos. 
@@ -75,16 +75,16 @@ There are many ways to collect and ingest data from various sources into the pla - Streaming data in real time from sources such as Kafka, Kinesis, Azure Event Hubs, or Google Pub/Sub. - Loading data directly from external databases using an event-driven or periodic/scheduled implementation. - See the explanation and examples in the [**ReadingFromExternalDB**](GettingStarted/ReadingFromExternalDB.ipynb) tutorial. + See the explanation and examples in the [**read-external-db**](getting-started/read-external-db.ipynb#ingest-from-external-db-to-no-sql-using-frames) tutorial. - Loading files (objects), in any format (for example, CSV, Parquet, JSON, or a binary image), from internal or external sources such as Amazon S3 or Hadoop. - See, for example, the [**FilesAccess**](GettingStarted/FilesAccess.ipynb) tutorial. + See, for example, the [**file-access**](getting-started/file-access.ipynb) tutorial. - Importing time-series telemetry data using a Prometheus compatible scraping API. - Ingesting (writing) data directly into the system using RESTful AWS-like simple-object, streaming, or NoSQL APIs. See the platform's [Web-API References](https://www.iguazio.com/docs/reference/latest-release/api-reference/web-apis/). - Scraping or reading data from external sources — such as Twitter, weather services, or stock-trading data services — using serverless functions. - See, for example, the [**stocks**](demos/stocks/read_stocks.ipynb) demo use-case application. + See, for example, the [**stocks**](demos/stocks/read-stocks.ipynb) demo use-case application. -For more information and examples of data collection and ingestion wcollect-n-exploreith the platform, see the [**collect-n-explore**](GettingStarted/collect-n-explore.ipynb#gs-data-collection-and-ingestion) tutorial Jupyter notebook. 
+For more information and examples of data collection and ingestion with the platform, see the [**collect-n-explore**](getting-started/collect-n-explore.ipynb#gs-data-collection-and-ingestion) tutorial Jupyter notebook. ### Exploring and Processing Data @@ -92,13 +92,13 @@ For more information and examples of data collection and ingestion wcollect-n-ex The platform includes a wide range of integrated open-source data query and exploration tools, including the following: - [Apache Spark](https://spark.apache.org/) data-processing engine — including the Spark SQL and Datasets, MLlib, R, and GraphX libraries — with real-time access to the platform's NoSQL data store and file system. - See the platform's [Spark APIs reference](https://www.iguazio.com/docs/reference/latest-release/api-reference/spark-apis/) and the examples in the [**SparkSQLAnalytics**](GettingStarted/SparkSQLAnalytics.ipynb) tutorial. + See the platform's [Spark APIs reference](https://www.iguazio.com/docs/reference/latest-release/api-reference/spark-apis/) and the examples in the [**spark-sql-analytics**](getting-started/spark-sql-analytics.ipynb) tutorial. - [Presto](http://prestodb.github.io/) distributed SQL query engine, which can be used to run interactive SQL queries over platform NoSQL tables or other object (file) data sources. See the platform's [Presto reference](https://www.iguazio.com/docs/reference/latest-release/presto/). - [pandas](https://pandas.pydata.org/) Python analysis library, including structured DataFrames. - [Dask](https://dask.org/) parallel-computing Python library, including scaled pandas DataFrames. - [V3IO Frames](https://github.com/v3io/frames) — Iguazio's open-source data-access library, which provides a unified high-performance API for accessing NoSQL, stream, and time-series data in the platform's data store and features native integration with pandas and [NVIDIA RAPIDS](https://rapids.ai/). 
- See, for example, the [**frames**](GettingStarted/frames.ipynb) tutorial. + See, for example, the [**frames**](getting-started/frames.ipynb) tutorial. - Built-in support for ML packages such as [scikit-learn](https://scikit-learn.org), [Pyplot](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.html), [NumPy](http://www.numpy.org/), [PyTorch](https://pytorch.org/), and [TensorFlow](https://www.tensorflow.org/). All these tools are integrated with the platform's Jupyter Notebook service, allowing users to access the same data from Jupyter through different interfaces with minimal configuration overhead. @@ -107,7 +107,7 @@ This design, coupled with the platform's unified data model, enables users to st > **Note:** You can deploy and manage application services, such as Spark and Jupyter Notebook, from the **Services** page of the platform dashboard. -For more information and examples of data exploration with the platform, see the [**collect-n-explore**](GettingStarted/collect-n-explore.ipynb#gs-data-exploration-and-processing) tutorial Jupyter notebook. +For more information and examples of data exploration with the platform, see the [**collect-n-explore**](getting-started/collect-n-explore.ipynb#gs-data-exploration-and-processing) tutorial Jupyter notebook. ### Building and Training Models @@ -117,7 +117,7 @@ When your model is ready, you can train it in Jupyter Notebook or by using scala You can find model-training examples in the platform's tutorial Jupyter notebooks: - The [NetOps demo](demos/netops/training.ipynb) tutorial demonstrates predictive infrastructure-monitoring using scikit-learn. -- The [image-classification demo](demos/image_classification/infer.ipynb) tutorial demonstrates image recognition using TensorFlow and Keras. +- The [image-classification demo](demos/image-classification/infer.ipynb) tutorial demonstrates image recognition using TensorFlow and Keras. 
If you're are a beginner, you might find the following ML guide useful — [Machine Learning Algorithms In Layman's Terms](https://towardsdatascience.com/machine-learning-algorithms-in-laymans-terms-part-1-d0368d769a7b). @@ -136,7 +136,7 @@ For detailed information about Nuclio, visit the [Nuclio web site](https://nucli > **Note:** Nuclio functions aren't limited to model serving: they can automate data collection, serve custom APIs, build real-time feature vectors, drive triggers, and more. For an overview of Nuclio and how to develop, document, and deploy serverless Python Nuclio functions from Jupyter Notebook, see the [nuclio-jupyter documentation](https://github.com/nuclio/nuclio-jupyter/blob/master/README.md). -You can also find examples in the platform tutorial Jupyter notebooks; for example, the [NetOps demo](demos/netops/nuclio_infer.ipynb) tutorial demonstrates how to deploy a network-operations model as a function. +You can also find examples in the platform tutorial Jupyter notebooks; for example, the [NetOps demo](demos/netops/infer.ipynb) tutorial demonstrates how to deploy a network-operations model as a function. ### Visualization, Monitoring, and Logging @@ -158,11 +158,11 @@ For information on how to create Grafana dashboards to monitor and visualize dat Iguazio provides full end-to-end use-case applications that demonstrate how to use the Iguazio Data Science Platform and related tools to address data science requirements for different industries and implementations. 
The applications are provided in the **demos** directory of the platform's tutorial Jupyter notebooks and cover the following use cases; for more detailed descriptions, see the demos README ([notebook](demos/README.ipynb) / [Markdown](demos/README.md)): -- **Smart stock trading** ([**stocks**](demos/stocks/read_stocks.ipynb)) — the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualization the data in a Grafana dashboard. +- **Smart stock trading** ([**stocks**](demos/stocks/read-stocks.ipynb)) — the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualizing the data in a Grafana dashboard. - **Predictive infrastructure monitoring** ([**netops**](demos/netops/generator.ipynb)) — the application builds, trains, and deploys a machine-learning model for analyzing and predicting failure in network devices as part of a network operations (NetOps) flow. The goal is to identify anomalies for device metrics — such as CPU, memory consumption, or temperature — which can signify an upcoming issue or failure. -- **Image recognition** ([**image_classification**](demos/image_classification/keras-cnn-dog-or-cat-classification.ipynb)) — the application builds and trains an ML model that identifies (recognizes) and classifies images by using Keras, TensorFlow, and scikit-learn. 
+- **Image recognition** ([**image-classification**](demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb)) — the application builds and trains an ML model that identifies (recognizes) and classifies images by using Keras, TensorFlow, and scikit-learn. - **Natural language processing (NLP)** ([**nlp**](demos/nlp/nlp-example.ipynb)) — the application processes natural-language textual data — including spelling correction and sentiment analysis — and generates a Nuclio serverless function that translates any given text string to another (configurable) language. -- **Streaming enrichment** ([**streaming-enrichment**](demos/streaming-enrichment/Streaming-enrichment.ipynb)) — the application demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream. +- **Stream enrichment** ([**stream-enrich**](demos/stream-enrich/stream-enrich.ipynb)) — the application demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream. ## Jupyter Notebook Basics @@ -183,18 +183,18 @@ The root file-browser directory of the platform's Jupyter Notebook service conta - The contents of the running-user home directory — **users/<running user>**. This directory contains the platform's [tutorial Jupyter notebooks](https://github.com/v3io/tutorials): - - [**Welcome.ipynb**](../Welcome.ipynb) — a documentation notebook that provides a short introduction to the platform and how to use it to implement a full data science workflow. 
- - **GettingStarted** — a directory containing getting-started tutorials that explain and demonstrate how to perform basic platform operations — such as data collection, ingestion, and analysis — as detailed in the current notebook. - - **demos** — a directory containing [end-to-end application use-case demos](../demos/README.ipynb). + - [**welcome.ipynb**](../welcome.ipynb) / [**README.md**](../README.md) — the current document, which provides a short introduction to the platform and how to use it to implement a full data science workflow. + - **getting-started** — a directory containing getting-started tutorials that explain and demonstrate how to perform different platform operations using the platform APIs and integrated tools. + - **demos** — a directory containing [end-to-end application use-case demos](#end-to-end-use-case-applications). -For information about the predefined data containers and how to reference data in these containers, see [Platform Data Containers](GettingStarted/collect-n-explore.ipynb/#platform-data-containers) in the **collect-n-explore** tutorial notebook. +For information about the predefined data containers and how to reference data in these containers, see [Platform Data Containers](getting-started/collect-n-explore.ipynb/#platform-data-containers) in the **collect-n-explore** tutorial notebook. ### Creating Virtual Environments in Jupyter Notebook A virtual environment is a named, isolated, working copy of Python that maintains its own files, directories, and paths so that you can work with specific versions of libraries or Python itself without affecting other Python projects. Virtual environments make it easy to cleanly separate projects and avoid problems with different dependencies and version requirements across components. 
-See the [CondaVirtualEnv](GettingStarted/CondaVirtualEnv.ipynb) tutorial notebook for step-by-step instructions for using conda to create your own Python virtual environments, which will appear as custom kernels in Jupyter Notebook. +See the [virutal-env](getting-started/virutal-env.ipynb) tutorial notebook for step-by-step instructions for using conda to create your own Python virtual environments, which will appear as custom kernels in Jupyter Notebook. ## Additional Resources diff --git a/demos/README.ipynb b/demos/README.ipynb index 8a59d35e..d0863c21 100644 --- a/demos/README.ipynb +++ b/demos/README.ipynb @@ -18,7 +18,7 @@ "- [Predictive Infrastructure Monitoring](#netops-demo)\n", "- [Image Recognition](#image-classification-demo)\n", "- [Natural Language Processing (NLP)](#nlp-demo)\n", - "- [Streaming Enrichment](#streaming-enrichment-demo)" + "- [Stream Enrichment](#stream-enrich-demo)" ] }, { @@ -38,7 +38,7 @@ "\n", "## Smart Stock Trading\n", "\n", - "The [**stocks**](stocks/read_stocks.ipynb) demo demonstrates a smart stock-trading application: \n", + "The [**stocks**](stocks/read-stocks.ipynb) demo demonstrates a smart stock-trading application: \n", "the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualization the data in a Grafana dashboard.\n", "\n", "- The stock data is read from Twitter by using the [TwythonStreamer](https://twython.readthedocs.io/en/latest/usage/streaming_api.html) Python wrapper to the Twitter Streaming API, and saved to TSDB and NoSQL tables in the platform.\n", @@ -70,7 +70,7 @@ "\n", "## Image Recognition\n", "\n", - "The [**image_classification**](image_classification/keras-cnn-dog-or-cat-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model 
that identifies (recognizes) and classifies images.\n", + "The [**image-classification**](image-classification/keras-cnn-dog-or-cat-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images.\n", "\n", "- The data is collected by downloading images of dogs and cats from the Iguazio sample data-set AWS bucket.\n", "- The training data for the ML model is prepared by using [pandas](https://pandas.pydata.org/) DataFrames to build a predecition map.\n", @@ -95,10 +95,10 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "\n", - "### Streaming Enrichment\n", + "\n", + "### Stream Enrichment\n", "\n", - "The [**streaming-enrichment**](streaming-enrichment/Streaming-enrichment.ipynb) demo demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream.\n", + "The [**stream-enrich**](stream-enrich/stream-enrich.ipynb) demo demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream.\n", "\n", "- Car-owner data is streamed into the platform from a simulated streaming engine by using an event-triggered [Nuclio](https://nuclio.io/) serverless function.\n", "- The data is written (ingested) into an input platform stream by using the the platform's [Streaming Web API](https://www.iguazio.com/docs/reference/latest-release/api-reference/web-apis/streaming-web-api/).\n", diff --git a/demos/README.md b/demos/README.md index c9487669..52ea7325 100644 --- a/demos/README.md +++ 
b/demos/README.md @@ -8,7 +8,7 @@ - [Predictive Infrastructure Monitoring](#netops-demo) - [Image Recognition](#image-classification-demo) - [Natural Language Processing (NLP)](#nlp-demo) -- [Streaming Enrichment](#streaming-enrichment-demo) +- [Stream Enrichment](#stream-enrich-demo) ## Overview @@ -18,7 +18,7 @@ The **demos** tutorials directory contains full end-to-end use-case applications ## Smart Stock Trading -The [**stocks**](stocks/read_stocks.ipynb) demo demonstrates a smart stock-trading application: +The [**stocks**](stocks/read-stocks.ipynb) demo demonstrates a smart stock-trading application: the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualization the data in a Grafana dashboard. - The stock data is read from Twitter by using the [TwythonStreamer](https://twython.readthedocs.io/en/latest/usage/streaming_api.html) Python wrapper to the Twitter Streaming API, and saved to TSDB and NoSQL tables in the platform. @@ -40,7 +40,7 @@ The goal is to identify anomalies for device metrics — such as CPU, memory ## Image Recognition -The [**image_classification**](image_classification/keras-cnn-dog-or-cat-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images. +The [**image-classification**](image-classification/keras-cnn-dog-or-cat-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images. - The data is collected by downloading images of dogs and cats from the Iguazio sample data-set AWS bucket. - The training data for the ML model is prepared by using [pandas](https://pandas.pydata.org/) DataFrames to build a predecition map. 
@@ -55,10 +55,10 @@ The [**nlp**](nlp/nlp-example.ipynb) demo demonstrates natural language processi - The textual data is collected and processed by using the [TextBlob](https://textblob.readthedocs.io/) Python NLP library. The processing includes spelling correction and sentiment analysis. - A serverless function that translates text to another language, which is configured in an environment variable, is generated by using the [Nuclio](https://nuclio.io/) framework. - -### Streaming Enrichment + +### Stream Enrichment -The [**streaming-enrichment**](streaming-enrichment/Streaming-enrichment.ipynb) demo demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream. +The [**stream-enrich**](stream-enrich/stream-enrich.ipynb) demo demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream. - Car-owner data is streamed into the platform from a simulated streaming engine by using an event-triggered [Nuclio](https://nuclio.io/) serverless function. - The data is written (ingested) into an input platform stream by using the the platform's [Streaming Web API](https://www.iguazio.com/docs/reference/latest-release/api-reference/web-apis/streaming-web-api/). 
diff --git a/demos/image_classification/infer.ipynb b/demos/image-classification/infer.ipynb similarity index 99% rename from demos/image_classification/infer.ipynb rename to demos/image-classification/infer.ipynb index bf6df250..93210e74 100644 --- a/demos/image_classification/infer.ipynb +++ b/demos/image-classification/infer.ipynb @@ -119,12 +119,12 @@ "name": "stdout", "output_type": "stream", "text": [ - "mounting volume path /model as ~/image_classification/cats_dogs/model\n" + "mounting volume path /model as ~/image-classification/cats_dogs/model\n" ] } ], "source": [ - "%nuclio mount /model ~/image_classification/cats_dogs/model" + "%nuclio mount /model ~/image-classification/cats_dogs/model" ] }, { @@ -374,7 +374,7 @@ " options:\n", " accessKey: ad348937-7359-48e8-8f68-8014c66f2d2c\n", " container: users\n", - " subPath: /adi/image_classification/cats_dogs/model\n", + " subPath: /iguazio/image-classification/cats_dogs/model\n", " name: fs\n", " volumeMount:\n", " mountPath: /model\n", diff --git a/demos/image_classification/keras-cnn-dog-or-cat-classification.ipynb b/demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb similarity index 100% rename from demos/image_classification/keras-cnn-dog-or-cat-classification.ipynb rename to demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb diff --git a/demos/netops/generator.ipynb b/demos/netops/generator.ipynb index fcc130d4..728ad098 100644 --- a/demos/netops/generator.ipynb +++ b/demos/netops/generator.ipynb @@ -570,7 +570,7 @@ "\n", "### Generate Simulated Metrics Per Device\n", "\n", - "Read a metrics schema, which describes simulated values, from **metrics_configuration.yaml**." + "Read a metrics schema, which describes simulated values, from **metrics-configuration.yaml**." 
] }, { @@ -580,11 +580,11 @@ "outputs": [], "source": [ "# Load the metrics configuration from a YAML file\n", - "with open('metrics_configuration.yaml', 'r') as f:\n", + "with open('metrics-configuration.yaml', 'r') as f:\n", " metrics_configuration = yaml.load(f)\n", "\n", "# Create a metrics generator according to the YAML configuration that was read\n", "met_gen = metrics_generator.Generator_df(metrics_configuration, user_hierarchy=deployment_df, initial_timestamp=time.time())\n", "metrics = met_gen.generate_range(start_time=datetime.datetime.now(),\n", " end_time=datetime.datetime.now()+datetime.timedelta(hours=1),\n", " as_df=True,\n", diff --git a/demos/netops/grafana_demo.ipynb b/demos/netops/grafana.ipynb similarity index 100% rename from demos/netops/grafana_demo.ipynb rename to demos/netops/grafana.ipynb diff --git a/demos/netops/nuclio_infer.ipynb b/demos/netops/infer.ipynb similarity index 100% rename from demos/netops/nuclio_infer.ipynb rename to demos/netops/infer.ipynb diff --git a/demos/netops/metrics_configuration.yaml b/demos/netops/metrics-configuration.yaml similarity index 100% rename from demos/netops/metrics_configuration.yaml rename to demos/netops/metrics-configuration.yaml diff --git a/demos/stocks/gen_demo_data.ipynb b/demos/stocks/gen-demo-data.ipynb similarity index 100% rename from demos/stocks/gen_demo_data.ipynb rename to demos/stocks/gen-demo-data.ipynb diff --git a/demos/stocks/read_stocks.ipynb b/demos/stocks/read-stocks.ipynb similarity index 100% rename from demos/stocks/read_stocks.ipynb rename to demos/stocks/read-stocks.ipynb diff --git a/demos/stocks/read_tweet.ipynb b/demos/stocks/read-tweets.ipynb similarity index 98% rename from demos/stocks/read_tweet.ipynb rename to demos/stocks/read-tweets.ipynb index d41a1d57..d99d2a70 100644 --- 
a/demos/stocks/read_tweet.ipynb +++ b/demos/stocks/read-tweets.ipynb @@ -271,7 +271,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "%nuclio: ['deploy', '-p', 'stocks', '-c', '/User/demos/stocks/read_tweet.ipynb']\n", + "%nuclio: ['deploy', '-p', 'stocks', '-c', '/User/demos/stocks/read-tweets.ipynb']\n", "%nuclio: [nuclio.deploy] 2019-03-20 16:28:06,028 (info) Building processor image\n", "%nuclio: [nuclio.deploy] 2019-03-20 16:28:08,047 (info) Pushing image\n", "%nuclio: [nuclio.deploy] 2019-03-20 16:28:08,048 (info) Build complete\n", @@ -283,13 +283,6 @@ "source": [ "%nuclio deploy -p stocks -c" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { diff --git a/demos/stocks/stream_viewer.ipynb b/demos/stocks/stream-viewer.ipynb similarity index 100% rename from demos/stocks/stream_viewer.ipynb rename to demos/stocks/stream-viewer.ipynb diff --git a/demos/streaming-enrichment/Streaming-enrichment.ipynb b/demos/stream-enrich/stream-enrich.ipynb similarity index 94% rename from demos/streaming-enrichment/Streaming-enrichment.ipynb rename to demos/stream-enrich/stream-enrich.ipynb index 88cbe1da..4d0dcb47 100644 --- a/demos/streaming-enrichment/Streaming-enrichment.ipynb +++ b/demos/stream-enrich/stream-enrich.ipynb @@ -4,14 +4,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Streaming enrichment\n", + "# Stream Enrichment\n", "\n", - "This example demonstrates how to enrich streaming data in real time with additional features stored in Iguazio NoSQL.
\n", + "This example demonstrates how to enrich stream data, in real time, with additional features that are stored in the NoSQL data store of the Iguazio Data Science Platform (**\"the platform\"**).
\n", "In this notebook you'll learn how to create and deploy a Nuclio function which is triggered by incoming event-messages to a V3IO-Stream.
\n", "The function enriches the original event-message with data from V3IO-NoSQL table and writes the enriched message to an output V3IO-Stream.\n", "In this notebook we'll create two streams: Stream 1 for input and Stream 2 for output and in addition we'll create a NoSQL table with additional info for enrichment
\n", - "The demo demonstrates sending an event to Iguazio stream with client name, car ID and email. Then the event will be enriched by joining the stream with the relevant record in the Cars table based on the CarID with additional information such as the car's color, manufacture year, vendor and state and then stored in another stream (called Stream2)

\n", - "The streams and the table are stored under \"Users\" container --> Username --> examples --> streaming_enrichment folder" + "The demo sends an event that contains a client name, car ID, and email address to a platform stream. The event is then enriched by joining it with the matching record in the Cars table, based on the CarID, which adds information such as the car's color, manufacture year, vendor, and state; the enriched event is stored in another stream (called Stream2).\n", + "\n", + "The streams and the table are stored in a **<running user>/examples/stream-enrich** directory in the \"users\" data container." ] }, { @@ -56,7 +57,7 @@ "\n", "INPUT_STREAM_NAME = 'stream1'\n", "INPUT_STREAM_SEARCH_KEY = 'CarID'\n", - "INPUT_STREAM_URL = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/streaming_enrichment/{INPUT_STREAM_NAME}/'\n", + "INPUT_STREAM_URL = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/stream-enrich/{INPUT_STREAM_NAME}/'\n", "INPUT_STREAM_PARTITIONS = [0, 1, 2]\n", "INPUT_STREAM_SEEK_TO = 'earliest'\n" ] @@ -92,7 +93,7 @@ "}\n", "\n", "for stream in [INPUT_STREAM_NAME, OUTPUT_STREAM_NAME]:\n", - " url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/streaming_enrichment/{stream}/'\n", + " url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/stream-enrich/{stream}/'\n", "\n", " response = requests.request(\"PUT\", url, data=payload, headers=headers)\n", "\n", @@ -130,7 +131,7 @@ } ], "source": [ - "url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/streaming_enrichment/{TABLE_NAME}/'\n", + "url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/stream-enrich/{TABLE_NAME}/'\n", "\n", "payloads = [{\n", " \"Key\" : {\n", @@ -297,7 +298,7 @@ " v3io_username = config['v3io_username']\n", " container_name = config['container_name']\n", " search_value = msg[config['input_stream_search_key']]\n", - " table_path_and_key = 
f\"{v3io_username}/examples/streaming_enrichment/{config['table_name']}/{search_value}\"\n", + " table_path_and_key = f\"{v3io_username}/examples/stream-enrich/{config['table_name']}/{search_value}\"\n", " v3io_access_key = config['v3io_access_key']\n", "\n", " url = _get_url(v3io_api, container_name, table_path_and_key)\n", @@ -317,7 +318,7 @@ " v3io_api = config['v3io_api']\n", " v3io_username = config['v3io_username']\n", " container_name = config['container_name']\n", - " output_stream_path = f\"{v3io_username}/examples/streaming_enrichment/{config['output_stream_name']}/\"\n", + " output_stream_path = f\"{v3io_username}/examples/stream-enrich/{config['output_stream_name']}/\"\n", " v3io_access_key = config['v3io_access_key']\n", "\n", " records = _items_to_records(items)\n", @@ -385,7 +386,7 @@ "source": [ "import base64\n", "\n", - "url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/streaming_enrichment/{INPUT_STREAM_NAME}/'\n", + "url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/stream-enrich/{INPUT_STREAM_NAME}/'\n", "\n", "msg = '{\"ClientName\": \"John Smith\", \"Email\": \"john.smith@myemailprovider.com\", \"CarID\": \"0\"}'\n", "msg_b64 = base64.b64encode(msg.encode('utf-8')).decode('utf-8')\n", @@ -444,7 +445,7 @@ "}\n", "\n", "for shard_id in INPUT_STREAM_PARTITIONS:\n", - " url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/streaming_enrichment/{OUTPUT_STREAM_NAME}/{shard_id}'\n", + " url = f'http://{V3IO_API}/{CONTAINER_NAME}/{V3IO_USERNAME}/examples/stream-enrich/{OUTPUT_STREAM_NAME}/{shard_id}'\n", " response = requests.request(\"PUT\", url, data=payload, headers=headers)\n", "\n", " if response.status_code == 200:\n", @@ -465,7 +466,7 @@ "metadata": {}, "outputs": [], "source": [ - "!rm -r /v3io/$V3IO_HOME/examples/streaming_enrichment" + "!rm -r /v3io/$V3IO_HOME/examples/stream-enrich" ] } ], diff --git a/GettingStarted/collect-n-explore.ipynb b/getting-started/collect-n-explore.ipynb 
similarity index 97% rename from GettingStarted/collect-n-explore.ipynb rename to getting-started/collect-n-explore.ipynb index 3337e41f..f1ca7856 100644 --- a/GettingStarted/collect-n-explore.ipynb +++ b/getting-started/collect-n-explore.ipynb @@ -29,7 +29,7 @@ "## Overview\n", "\n", "This tutorial explains and demonstrates how to collect, ingest, and explore data with the Iguazio Data Science Platform (**\"the platform\"**).
\n", - "For an overview of the platform and how it can be used to implement a full data science workflow, see the [**Welcome**](../Welcome.ipynb) tutorial notebook.
\n", + "For an overview of the platform and how it can be used to implement a full data science workflow, see the [**welcome**](../welcome.ipynb) tutorial notebook.
\n", "For full end-to-end platform use-case application demos, see [**demos**](../demos/README.ipynb) tutorial notebooks directory." ] }, @@ -72,7 +72,7 @@ "## Collecting and Ingesting Data\n", "\n", "The platform supports various alternative methods for collecting and ingesting data into its data containers (i.e., its data store).\n", - "For more information, see the [Welcome](../Welcome.ipynb#data-collection-and-ingestion) platform tutorial Jupyter notebook\n", + "For more information, see the [**welcome**](../welcome.ipynb#data-collection-and-ingestion) platform tutorial Jupyter notebook.\n", "The data collection and ingestion can be done as a one-time operation, using different platform APIs — which can be run from your preferred programming interface, such as an interactive web-based Jupyter or Zeppelin notebook — or as an ongoing ingestion stream, using Nuclio serverless functions.\n", "This section explains and demonstrates how to collect and ingest (import) data into the platform using code that's run from a Jupyter notebook." ] @@ -84,7 +84,7 @@ "\n", "### Ingesting Data From an External Database to a NoSQL Table Using V3IO Frames\n", "\n", - "For an example of how to collect data from an external database — such as MySQL, Oracle, and Postgress — and ingest (write) it into a NoSQL table in the platform, using the V3IO Frames API, see the [ReadingFromExternalDB](ReadingFromExternalDB.ipynb) getting-started tutorial." + "For an example of how to collect data from an external database — such as MySQL, Oracle, or PostgreSQL — and ingest (write) it into a NoSQL table in the platform, using the V3IO Frames API, see the [read-external-db](read-external-db.ipynb) getting-started tutorial." 
] }, { @@ -104,7 +104,7 @@ "\n", "You can use a simple [curl](https://curl.haxx.se/) command to ingest a file (object) from an external web data source, such as an Amazon S3 bucket, to the platform's distributed file system (i.e., into the platform's data store).\n", "This is demonstrated in the following code example and in the [getting-started example](#getting-started-example) in this notebook.\n", - "The [SparkSQLAnalytics](SparkSQLAnalytics.ipynb) getting-started tutorial notebook demonstrates a similar ingestion using [Botocore](https://github.com/boto/botocore).\n", + "The [spark-sql-analytics](spark-sql-analytics.ipynb) getting-started tutorial notebook demonstrates a similar ingestion using [Botocore](https://github.com/boto/botocore).\n", "\n", "The example in the following cells uses curl to read a CSV file from the [Iguazio sample data-sets](http://iguazio-sample-data.s3.amazonaws.com/) public Amazon S3 bucket and save it to an **examples** directory in the running-user directory of the predefined \"users\" data container (`/v3io/users/$V3IO_USERNAME` = `v3io/$V3IO_HOME` = `/User`)." 
] @@ -172,7 +172,7 @@ "\n", "After you have ingested data into the platform's data containers, you can use various alternative methods and tools to explore and analyze the data.\n", "Data scientists typically use Jupyter Notebook to run the exploration phase.\n", - "As outlined in the [Welcome](../Welcome.ipynb#data-exploration-and-processing) tutorial notebook, the platform's Jupyter Notebook service has a wide range of pre-deployed popular data science tools (such as Spark and Presto) and allows installation of additional tools and packages, enabling you to use different APIs to access the same data from a single Jupyter notebook.\n", + "As outlined in the [**welcome**](../welcome.ipynb#data-exploration-and-processing) tutorial notebook, the platform's Jupyter Notebook service has a wide range of pre-deployed popular data science tools (such as Spark and Presto) and allows installation of additional tools and packages, enabling you to use different APIs to access the same data from a single Jupyter notebook.\n", "This section explains and demonstrates how to explore data in the platform from a Jupyter notebook." ] }, @@ -185,7 +185,7 @@ "\n", "Spark is a distributed computing framework for data analytics.\n", "You can easily run distributed Spark jobs on you platform cluster that use Spark DataFrames to access data files (objects), tables, or streams in the platform's data store.\n", - "For more information and examples, see the [SparkSQLAnalytics](SparkSQLAnalytics.ipynb) getting-started tutorial notebook." + "For more information and examples, see the [spark-sql-analytics](spark-sql-analytics.ipynb) getting-started tutorial notebook." 
] }, { diff --git a/GettingStarted/FilesAccess.ipynb b/getting-started/file-access.ipynb similarity index 100% rename from GettingStarted/FilesAccess.ipynb rename to getting-started/file-access.ipynb diff --git a/GettingStarted/frames.ipynb b/getting-started/frames.ipynb similarity index 100% rename from GettingStarted/frames.ipynb rename to getting-started/frames.ipynb diff --git a/GettingStarted/ReadWriteFromParquet.ipynb b/getting-started/parquet-read-write.ipynb similarity index 100% rename from GettingStarted/ReadWriteFromParquet.ipynb rename to getting-started/parquet-read-write.ipynb diff --git a/GettingStarted/ReadingFromExternalDB.ipynb b/getting-started/read-external-db.ipynb similarity index 100% rename from GettingStarted/ReadingFromExternalDB.ipynb rename to getting-started/read-external-db.ipynb diff --git a/GettingStarted/SparkSQLAnalytics.ipynb b/getting-started/spark-sql-analytics.ipynb similarity index 99% rename from GettingStarted/SparkSQLAnalytics.ipynb rename to getting-started/spark-sql-analytics.ipynb index 574442da..1f0ba5f5 100644 --- a/GettingStarted/SparkSQLAnalytics.ipynb +++ b/getting-started/spark-sql-analytics.ipynb @@ -533,7 +533,7 @@ "2. 
Use Spark JDBC to read table from AWS Redshift\n", "\n", "\n", - "For more details read [Reading From External Databases](ReadingFromExternalDBs.ipynb) and [Spark JDBC to Databases](SparkJDBCtoDBs.ipynb)" + "For more details read [read-external-db](read-external-db.ipynb) and [Spark JDBC to Databases](SparkJDBCtoDBs.ipynb)" ] }, { diff --git a/GettingStarted/Conda_virtual_env.ipynb b/getting-started/virutal-env.ipynb similarity index 100% rename from GettingStarted/Conda_virtual_env.ipynb rename to getting-started/virutal-env.ipynb diff --git a/Welcome.ipynb b/welcome.ipynb similarity index 88% rename from Welcome.ipynb rename to welcome.ipynb index 4ef525eb..00f5b730 100644 --- a/Welcome.ipynb +++ b/welcome.ipynb @@ -19,11 +19,11 @@ " - [Deploying Models to Production](#deploying-models-to-production)\n", " - [Visualization, Monitoring, and Logging](#visualization-monitoring-and-logging)\n", "- [End-to-End Use-Case Applications](#end-to-end-use-case-applications)\n", - " - [Smart Stock Trading](demos/stocks/read_stocks.ipynb)\n", + " - [Smart Stock Trading](demos/stocks/read-stocks.ipynb)\n", " - [Predictive Infrastructure Monitoring](demos/netops/generator.ipynb)\n", - " - [Image Recognition](demos/image_classification/keras-cnn-dog-or-cat-classification.ipynb)\n", + " - [Image Recognition](demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb)\n", " - [Natural Language Processing (NLP)](demos/nlp/nlp-example.ipynb)\n", - " - [Streaming Enrichment](demos/streaming-enrichment/Streaming-enrichment.ipynb)\n", + " - [Stream Enrichment](demos/stream-enrich/stream-enrich.ipynb)\n", "- [Jupyter Notebook Basics](#jupyter-notebook-basics)\n", " - [Creating Virtual Environments in Jupyter Notebook](#creating-virtual-environments-in-jupyter-notebook)\n", "- [Additional Resources](#additional-resources)\n", @@ -68,7 +68,7 @@ "\n", "A good place to start your development is with the platform [tutorial Jupyter notebooks](https://github.com/v3io/tutorials).\n", 
"\n", - "- The [**GettingStarted**](GettingStarted/collect-n-explore.ipynb) directory contains information and code examples to help you quickly get started using the platform.\n", + "- The [**getting-started**](getting-started/collect-n-explore.ipynb) directory contains information and code examples to help you quickly get started using the platform.\n", "- The [**demos**](demos/README.ipynb) directory contains full end-to-end use-case application demos." ] }, @@ -100,16 +100,16 @@ "\n", "- Streaming data in real time from sources such as Kafka, Kinesis, Azure Event Hubs, or Google Pub/Sub.\n", "- Loading data directly from external databases using an event-driven or periodic/scheduled implementation.\n", - " See the explanation and examples in the [**ReadingFromExternalDB**](GettingStarted/ReadingFromExternalDB.ipynb) tutorial.\n", + " See the explanation and examples in the [**read-external-db**](getting-started/read-external-db.ipynb#ingest-from-external-db-to-no-sql-using-frames) tutorial.\n", "- Loading files (objects), in any format (for example, CSV, Parquet, JSON, or a binary image), from internal or external sources such as Amazon S3 or Hadoop.\n", - " See, for example, the [**FilesAccess**](GettingStarted/FilesAccess.ipynb) tutorial.\n", + " See, for example, the [**file-access**](getting-started/file-access.ipynb) tutorial.\n", "- Importing time-series telemetry data using a Prometheus compatible scraping API.\n", "- Ingesting (writing) data directly into the system using RESTful AWS-like simple-object, streaming, or NoSQL APIs.\n", " See the platform's [Web-API References](https://www.iguazio.com/docs/reference/latest-release/api-reference/web-apis/).\n", "- Scraping or reading data from external sources — such as Twitter, weather services, or stock-trading data services — using serverless functions.\n", - " See, for example, the [**stocks**](demos/stocks/read_stocks.ipynb) demo use-case application.\n", + " See, for example, the 
[**stocks**](demos/stocks/read-stocks.ipynb) demo use-case application.\n", "\n", - "For more information and examples of data collection and ingestion wcollect-n-exploreith the platform, see the [**collect-n-explore**](GettingStarted/collect-n-explore.ipynb#gs-data-collection-and-ingestion) tutorial Jupyter notebook." + "For more information and examples of data collection and ingestion with the platform, see the [**collect-n-explore**](getting-started/collect-n-explore.ipynb#gs-data-collection-and-ingestion) tutorial Jupyter notebook." ] }, { @@ -122,13 +122,13 @@ "The platform includes a wide range of integrated open-source data query and exploration tools, including the following:\n", "\n", "- [Apache Spark](https://spark.apache.org/) data-processing engine — including the Spark SQL and Datasets, MLlib, R, and GraphX libraries — with real-time access to the platform's NoSQL data store and file system.\n", - " See the platform's [Spark APIs reference](https://www.iguazio.com/docs/reference/latest-release/api-reference/spark-apis/) and the examples in the [**SparkSQLAnalytics**](GettingStarted/SparkSQLAnalytics.ipynb) tutorial.\n", + " See the platform's [Spark APIs reference](https://www.iguazio.com/docs/reference/latest-release/api-reference/spark-apis/) and the examples in the [**spark-sql-analytics**](getting-started/spark-sql-analytics.ipynb) tutorial.\n", "- [Presto](http://prestodb.github.io/) distributed SQL query engine, which can be used to run interactive SQL queries over platform NoSQL tables or other object (file) data sources.\n", " See the platform's [Presto reference](https://www.iguazio.com/docs/reference/latest-release/presto/).\n", "- [pandas](https://pandas.pydata.org/) Python analysis library, including structured DataFrames.\n", "- [Dask](https://dask.org/) parallel-computing Python library, including scaled pandas DataFrames.\n", "- [V3IO Frames](https://github.com/v3io/frames) — Iguazio's open-source data-access library, 
which provides a unified high-performance API for accessing NoSQL, stream, and time-series data in the platform's data store and features native integration with pandas and [NVIDIA RAPIDS](https://rapids.ai/).\n", - " See, for example, the [**frames**](GettingStarted/frames.ipynb) tutorial.\n", + " See, for example, the [**frames**](getting-started/frames.ipynb) tutorial.\n", "- Built-in support for ML packages such as [scikit-learn](https://scikit-learn.org), [Pyplot](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.html), [NumPy](http://www.numpy.org/), [PyTorch](https://pytorch.org/), and [TensorFlow](https://www.tensorflow.org/).\n", "\n", "All these tools are integrated with the platform's Jupyter Notebook service, allowing users to access the same data from Jupyter through different interfaces with minimal configuration overhead.\n", @@ -137,7 +137,7 @@ "\n", "> **Note:** You can deploy and manage application services, such as Spark and Jupyter Notebook, from the **Services** page of the platform dashboard.\n", "\n", - "For more information and examples of data exploration with the platform, see the [**collect-n-explore**](GettingStarted/collect-n-explore.ipynb#gs-data-exploration-and-processing) tutorial Jupyter notebook." + "For more information and examples of data exploration with the platform, see the [**collect-n-explore**](getting-started/collect-n-explore.ipynb#gs-data-exploration-and-processing) tutorial Jupyter notebook." 
] }, { @@ -152,7 +152,7 @@ "You can find model-training examples in the platform's tutorial Jupyter notebooks:\n", "\n", "- The [NetOps demo](demos/netops/training.ipynb) tutorial demonstrates predictive infrastructure-monitoring using scikit-learn.\n", - "- The [image-classification demo](demos/image_classification/infer.ipynb) tutorial demonstrates image recognition using TensorFlow and Keras.\n", + "- The [image-classification demo](demos/image-classification/infer.ipynb) tutorial demonstrates image recognition using TensorFlow and Keras.\n", "\n", "If you're are a beginner, you might find the following ML guide useful — [Machine Learning Algorithms In Layman's Terms](https://towardsdatascience.com/machine-learning-algorithms-in-laymans-terms-part-1-d0368d769a7b)." ] @@ -176,7 +176,7 @@ "> **Note:** Nuclio functions aren't limited to model serving: they can automate data collection, serve custom APIs, build real-time feature vectors, drive triggers, and more.\n", "\n", "For an overview of Nuclio and how to develop, document, and deploy serverless Python Nuclio functions from Jupyter Notebook, see the [nuclio-jupyter documentation](https://github.com/nuclio/nuclio-jupyter/blob/master/README.md).\n", - "You can also find examples in the platform tutorial Jupyter notebooks; for example, the [NetOps demo](demos/netops/nuclio_infer.ipynb) tutorial demonstrates how to deploy a network-operations model as a function." + "You can also find examples in the platform tutorial Jupyter notebooks; for example, the [NetOps demo](demos/netops/infer.ipynb) tutorial demonstrates how to deploy a network-operations model as a function." 
] }, { @@ -208,11 +208,11 @@ "Iguazio provides full end-to-end use-case applications that demonstrate how to use the Iguazio Data Science Platform and related tools to address data science requirements for different industries and implementations.\n", "The applications are provided in the **demos** directory of the platform's tutorial Jupyter notebooks and cover the following use cases; for more detailed descriptions, see the demos README ([notebook](demos/README.ipynb) / [Markdown](demos/README.md)):\n", "\n", - "- **Smart stock trading** ([**stocks**](demos/stocks/read_stocks.ipynb)) — the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualization the data in a Grafana dashboard.\n", + "- **Smart stock trading** ([**stocks**](demos/stocks/read-stocks.ipynb)) — the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and for analyzing and visualizing the data in a Grafana dashboard.\n", "- **Predictive infrastructure monitoring** ([**netops**](demos/netops/generator.ipynb)) — the application builds, trains, and deploys a machine-learning model for analyzing and predicting failure in network devices as part of a network operations (NetOps) flow. 
The goal is to identify anomalies for device metrics — such as CPU, memory consumption, or temperature — which can signify an upcoming issue or failure.\n", - "- **Image recognition** ([**image_classification**](demos/image_classification/keras-cnn-dog-or-cat-classification.ipynb)) — the application builds and trains an ML model that identifies (recognizes) and classifies images by using Keras, TensorFlow, and scikit-learn.\n", + "- **Image recognition** ([**image-classification**](demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb)) — the application builds and trains an ML model that identifies (recognizes) and classifies images by using Keras, TensorFlow, and scikit-learn.\n", "- **Natural language processing (NLP)** ([**nlp**](demos/nlp/nlp-example.ipynb)) — the application processes natural-language textual data — including spelling correction and sentiment analysis — and generates a Nuclio serverless function that translates any given text string to another (configurable) language.\n", - "- **Streaming enrichment** ([**streaming-enrichment**](demos/streaming-enrichment/Streaming-enrichment.ipynb)) — the application demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream." + "- **Stream enrichment** ([**stream-enrich**](demos/stream-enrich/stream-enrich.ipynb)) — the application demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream." 
] }, { @@ -238,11 +238,11 @@ "- The contents of the running-user home directory — **users/<running user>**.\n", " This directory contains the platform's [tutorial Jupyter notebooks](https://github.com/v3io/tutorials):\n", "\n", - " - [**Welcome.ipynb**](../Welcome.ipynb) — a documentation notebook that provides a short introduction to the platform and how to use it to implement a full data science workflow.\n", - " - **GettingStarted** — a directory containing getting-started tutorials that explain and demonstrate how to perform basic platform operations — such as data collection, ingestion, and analysis — as detailed in the current notebook.\n", - " - **demos** — a directory containing [end-to-end application use-case demos](../demos/README.ipynb).\n", + " - [**welcome.ipynb**](../welcome.ipynb) / [**README.md**](../README.md) — the current document, which provides a short introduction to the platform and how to use it to implement a full data science workflow.\n", + " - **getting-started** — a directory containing getting-started tutorials that explain and demonstrate how to perform different platform operations using the platform APIs and integrated tools.\n", + " - **demos** — a directory containing [end-to-end application use-case demos](#end-to-end-use-case-applications).\n", "\n", - "For information about the predefined data containers and how to reference data in these containers, see [Platform Data Containers](GettingStarted/collect-n-explore.ipynb/#platform-data-containers) in the **collect-n-explore** tutorial notebook." + "For information about the predefined data containers and how to reference data in these containers, see [Platform Data Containers](getting-started/collect-n-explore.ipynb#platform-data-containers) in the **collect-n-explore** tutorial notebook." 
] }, { @@ -254,7 +254,7 @@ "\n", "A virtual environment is a named, isolated, working copy of Python that maintains its own files, directories, and paths so that you can work with specific versions of libraries or Python itself without affecting other Python projects.\n", "Virtual environments make it easy to cleanly separate projects and avoid problems with different dependencies and version requirements across components.\n", - "See the [CondaVirtualEnv](GettingStarted/CondaVirtualEnv.ipynb) tutorial notebook for step-by-step instructions for using conda to create your own Python virtual environments, which will appear as custom kernels in Jupyter Notebook." + "See the [virutal-env](getting-started/virutal-env.ipynb) tutorial notebook for step-by-step instructions for using conda to create your own Python virtual environments, which will appear as custom kernels in Jupyter Notebook." ] }, {