Deep distributed sentiment analysis on data stream (example)

Text stream sentiment analysis using a distributed deep learning approach based on Convolutional Neural Networks.

Language: Python 2.7

Streaming platform: Kafka

Distributed deep learning: Spark + BigDL

Authors

Mario Ruggieri

e-mail: [email protected]

Dependencies

Intel BigDL with Spark: https://bigdl-project.github.io/master/#PythonUserGuide/install-from-pip/
Kafka 0.8.2.2: https://kafka.apache.org/downloads

Usage

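Starting the Kafka ecosystem (run from the kafka_2.11-0.8.2.2 directory, which contains bin; each command stays in the foreground, so use one terminal per command):

./bin/zookeeper-server-start.sh config/zookeeper.properties
./bin/kafka-server-start.sh config/server.properties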
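
Before starting the producers it can help to confirm that both services accept connections. The following is a minimal sketch, not part of this repository; it assumes the stock ports from config/zookeeper.properties (2181) and config/server.properties (9092), and the helper name is_port_open is hypothetical.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success instead of raising on refusal
        return sock.connect_ex((host, port)) == 0
    except socket.error:
        # timeouts and similar failures count as "not reachable"
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    # Assumption: default ports; adjust if the config files were changed.
    for name, port in [("ZooKeeper", 2181), ("Kafka broker", 9092)]:
        state = "up" if is_port_open("localhost", port) else "down"
        print("%s (port %d): %s" % (name, port, state))
```

An open port only shows that a process is listening; it does not prove the broker has finished registering with ZooKeeper.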

Starting producers:

python [HERE PATH TO kafka_producer.py]
python [HERE PATH TO kafka_producer_for_word_prediction.py]
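
For readers who want to see roughly what such a producer does, here is a hedged sketch; it is not the code of kafka_producer.py. It assumes the kafka-python package, a broker on localhost:9092, a hypothetical topic named "sentiment", and a hypothetical input file with one message per line. The sending step is passed in as a callable so the loop can be tried without a broker.

```python
import io
import time


def stream_lines(lines, send, topic, delay=0.0):
    """Send each non-empty line to `topic` via send(topic, payload_bytes).

    Returns the number of messages sent.
    """
    sent = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        send(topic, text.encode("utf-8"))
        sent += 1
        if delay:
            time.sleep(delay)  # throttle to imitate a live stream
    return sent


def run_with_kafka(path, topic="sentiment", brokers="localhost:9092"):
    """Wire stream_lines to a real broker (topic and brokers are assumptions)."""
    from kafka import KafkaProducer  # kafka-python, imported only when needed

    producer = KafkaProducer(bootstrap_servers=brokers)
    with io.open(path, encoding="utf-8") as source:
        count = stream_lines(
            source, lambda t, v: producer.send(t, value=v), topic, delay=0.1
        )
    producer.flush()  # block until buffered messages are delivered
    return count
```

Calling run_with_kafka with the path to a text file would publish its lines one by one; the delay keeps the consumer's micro-batches small enough to watch.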

Stream testing with pretrained models:

${SPARK_HOME}/bin/spark-submit  \
--py-files ${PYTHON_API_ZIP_PATH},[HERE ABSOLUTE PATH TO cnn_stream_classifier.py]  \
--jars ${BigDL_JAR_PATH}  \
--conf spark.driver.extraClassPath=${BigDL_JAR_PATH} \
--conf spark.executor.extraClassPath=bigdl-0.2.0-SNAPSHOT-jar-with-dependencies.jar  \
--conf spark.executorEnv.PYTHONHASHSEED=${PYTHONHASHSEED} \
--packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.1 \
--driver-memory 10g  \
--executor-cores 4 \
--executor-memory 60g \
--num-executors 1 \
[HERE RELATIVE PATH TO cnn_stream_classifier.py] \
--action streaming_test \
--modelPath [HERE PATH TO model_for_sentiment]

${SPARK_HOME}/bin/spark-submit  \
--py-files ${PYTHON_API_ZIP_PATH},[HERE ABSOLUTE PATH TO lstm_word_prediction_single_out.py]  \
--jars ${BigDL_JAR_PATH}  \
--conf spark.driver.extraClassPath=${BigDL_JAR_PATH} \
--conf spark.executor.extraClassPath=bigdl-0.2.0-SNAPSHOT-jar-with-dependencies.jar  \
--conf spark.executorEnv.PYTHONHASHSEED=${PYTHONHASHSEED} \
--packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.1 \
--driver-memory 10g  \
--executor-cores 4 \
--executor-memory 60g \
--num-executors 1 \
[HERE RELATIVE PATH TO lstm_word_prediction_single_out.py]  \
--action streaming_test \
--modelPath [HERE PATH TO model_for_word_pred]
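
The --packages line above pulls in the Spark Streaming connector for Kafka 0.8. As a rough illustration of how a job can read from that connector, here is a sketch under stated assumptions; it is not the logic of cnn_stream_classifier.py. The topic name, broker address, tokenizer, and the classify callable are all hypothetical, and the BigDL model call is deliberately left to the caller.

```python
import re

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def run_stream(classify, topic="sentiment", brokers="localhost:9092",
               batch_seconds=5):
    """Consume `topic` and print classify(tokens) for every message.

    pyspark is imported lazily so tokenize() can be used without Spark.
    """
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils  # Spark 2.x, kafka-0-8 package

    sc = SparkContext(appName="stream-sketch")
    ssc = StreamingContext(sc, batch_seconds)
    # Each record arrives as a (key, value) pair; the value holds the text.
    stream = KafkaUtils.createDirectStream(
        ssc, [topic], {"metadata.broker.list": brokers}
    )
    stream.map(lambda kv: classify(tokenize(kv[1]))).pprint()
    ssc.start()
    ssc.awaitTermination()
```

Passing classify as a parameter keeps the stream wiring separate from the model, so the same skeleton could serve either the sentiment or the word-prediction script.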

Training (generating models):

${SPARK_HOME}/bin/spark-submit  \
  --py-files ${PYTHON_API_ZIP_PATH},[HERE ABSOLUTE PATH TO cnn_stream_classifier.py] \
  --jars ${BigDL_JAR_PATH}  \
  --conf spark.driver.extraClassPath=${BigDL_JAR_PATH} \
  --conf spark.executor.extraClassPath=bigdl-0.2.0-SNAPSHOT-jar-with-dependencies.jar  \
  --conf spark.executorEnv.PYTHONHASHSEED=${PYTHONHASHSEED} \
  --driver-memory 10g  \
  --executor-cores 4 \
  --executor-memory 60g \
  --num-executors 1 \
  [HERE RELATIVE PATH TO cnn_stream_classifier.py] \
  --checkpoint_path [HERE PATH TO THE CHECKPOINT PATH] \
  --log_path [HERE PATH TO THE LOG PATH]

${SPARK_HOME}/bin/spark-submit  \
--py-files ${PYTHON_API_ZIP_PATH},[HERE ABSOLUTE PATH TO lstm_word_prediction_single_out.py] \
--jars ${BigDL_JAR_PATH}  \
--conf spark.driver.extraClassPath=${BigDL_JAR_PATH} \
--conf spark.executor.extraClassPath=bigdl-0.2.0-SNAPSHOT-jar-with-dependencies.jar  \
--conf spark.executorEnv.PYTHONHASHSEED=${PYTHONHASHSEED} \
--driver-memory 10g  \
--executor-cores 4 \
--executor-memory 60g \
--num-executors 1 \
[HERE RELATIVE PATH TO lstm_word_prediction_single_out.py] \
--checkpoint_path [HERE PATH TO THE CHECKPOINT PATH] \
--log_path [HERE PATH TO THE LOG PATH]
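
The commands in this section pass four script-level flags: --action, --modelPath, --checkpoint_path and --log_path. The sketch below shows one way a script could parse and check them; it is an illustration, not the parser used in this repository. It assumes that training is the default action (the training commands pass no --action), that the default is named "train", and that a checkpoint path is mandatory for training.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Train or stream-test a model (sketch)"
    )
    # Assumption: the default action is training and is called "train".
    parser.add_argument("--action", default="train",
                        choices=["train", "streaming_test"])
    parser.add_argument("--modelPath", default=None,
                        help="saved model used by streaming_test")
    parser.add_argument("--checkpoint_path", default=None,
                        help="directory for training checkpoints")
    parser.add_argument("--log_path", default=None,
                        help="directory for training logs")
    return parser


def validate(args):
    """Return an error message, or None if the flag combination is usable."""
    if args.action == "streaming_test" and not args.modelPath:
        return "--modelPath is required with --action streaming_test"
    if args.action == "train" and not args.checkpoint_path:
        # Assumption: training cannot proceed without a checkpoint directory.
        return "--checkpoint_path is required for training"
    return None
```

Checking the combination up front reports a missing path before Spark starts, which is quicker than discovering it after the executors have been allocated.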

License

Please read the LICENSE file (Apache License 2.0).
