connector tw configurations
Under the folder sda\confs\connector-tw you will find 3 configuration files:

- The log4j properties file: set where you want the connector to log, and edit the file to suit your needs.
- The Hibernate configuration file: edit it if you compiled the GE with the default DAO implementation; if you provide a different implementation you can leave this file as is or delete it. Set the following fields to match your database configuration:
```xml
<property name="connection.url"></property>
<property name="connection.username"></property>
<property name="connection.password"></property>
```
You can find the model of the default DAO in social-data-aggregator/data_model in the project directory.
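For illustration, a filled-in version for a MySQL database might look like this (driver class, URL, and credentials are placeholders, not values shipped with the project):

```xml
<!-- Placeholder values: replace host, schema and credentials with your own -->
<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<property name="connection.url">jdbc:mysql://localhost:3306/sda</property>
<property name="connection.username">sda_user</property>
<property name="connection.password">changeme</property>
```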
The remaining configuration file is the connector properties file, organized in sections. The first section contains the properties for the connection with Twitter:
Key Name | Optional | Description |
---|---|---|
twConsumerKey | NO | Consumer key of the Twitter application |
twConsumerSecret | NO | Consumer secret of the Twitter application |
twToken | NO | User access token |
twTokenSecret | NO | User access token secret |
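For illustration, this section might look as follows, assuming standard Java properties key=value syntax; the values are placeholders for the credentials of your own Twitter application:

```properties
# Placeholder credentials: replace with the keys of your Twitter application
twConsumerKey=xxxxxxxxxxxxxxxxxxxx
twConsumerSecret=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
twToken=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
twTokenSecret=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```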
This section of the configuration file contains the settings for the node that hosts the driver:
Key Name | Optional | Description |
---|---|---|
nodeName | NO | The name of the node (it must match the monitoring_from_node field in the DB model if you use the default DAO). This property is needed when multiple collector instances run on nodes that have different public IPs but share the same RDBMS: it lets you choose which keys are monitored from a target node. |
proxyPort | YES | The proxy port (uncomment this property if you use a proxy for outbound connections) |
proxyHost | YES | The proxy host (uncomment this property if you use a proxy for outbound connections) |
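A sketch of this section, again assuming key=value properties syntax; the node name and proxy coordinates are placeholders:

```properties
# Must match monitoring_from_node in the DB model when using the default DAO
nodeName=collector-node-1
# Uncomment only if outbound connections go through a proxy (placeholder host/port)
#proxyHost=proxy.example.com
#proxyPort=8080
```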
This section of the configuration file contains the settings for the Spark Streaming context:
Key Name | Optional | Description |
---|---|---|
numMaxCore | YES | Number of cores to assign to this application (useful when you have to run multiple streaming applications). If you run only the collector you can comment out this property |
checkpointDir | NO | Directory where Spark will save this application's checkpoints |
sparkBatchDurationMillis | NO | Duration of the batch (in milliseconds). It is the basic interval at which the system receives the data in batches |
sparkCleanTTL | NO | Duration (in seconds) for which Spark will remember any metadata (stages generated, tasks generated, etc.). Periodic cleanups ensure that metadata older than this duration is forgotten |
twitterInserterWindowDuration | NO | Duration of the window, i.e. the save frequency for gathered data. Both the window duration and the slide duration must be multiples of the batch interval |
twitterInserterWindowSlidingInterval | NO | Window sliding interval, i.e. the interval at which the window slides forward (set it equal to twitterInserterWindowDuration to avoid saving duplicated data) |
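As an illustrative example (placeholder values; note that the window and slide durations are multiples of the batch duration, and equal to each other to avoid duplicated saves):

```properties
# Uncomment to cap the cores when running multiple streaming applications
#numMaxCore=2
checkpointDir=/tmp/sda/checkpoints
# 5-second batches; window and slide must be multiples of this value
sparkBatchDurationMillis=5000
sparkCleanTTL=3600
# Assuming milliseconds, as for the batch duration: save gathered data every minute
twitterInserterWindowDuration=60000
# Equal to the window duration to avoid saving duplicated data
twitterInserterWindowSlidingInterval=60000
```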
This section of the configuration file contains the settings for the application itself:
Key Name | Optional | Description |
---|---|---|
serverPort | NO | The port on which the embedded Jetty server will listen. Needed to start, restart and stop the collector |
savePartitions | NO | Number of partitions to coalesce to before saving. A value of 1 generates one file of raw tweets per window |
dataOutputFolder | NO | The base folder where the raw data will be saved |
dataRootFolder | NO | Root folder, under dataOutputFolder, in which data will be saved. Example: dataOutputFolder=file://tmp/data and dataRootFolder=raw will save data under file://tmp/data/raw/... |
daoClass | YES | Class of a custom DAO, if you don't want to use the default one |
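For example (all values are placeholders; the paths mirror the dataOutputFolder/dataRootFolder example above):

```properties
serverPort=8085
# Coalesce to a single partition: one raw-tweet file per window
savePartitions=1
dataOutputFolder=file://tmp/data
dataRootFolder=raw
# Uncomment to plug in a custom DAO (hypothetical class name)
#daoClass=com.example.sda.MyCustomDao
```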
This section of the configuration file contains the settings for Kafka. If you don't want the data sent to Kafka, delete or comment out the following properties:
Key Name | Optional | Description |
---|---|---|
brokersList | NO | Comma-separated list of Kafka brokers |
kafkaSerializationClass | NO | Defaults to kafka.serializer.StringEncoder. Change it if you want another serializer |
kafkaRequiredAcks | NO | Tells Kafka how many acknowledgements the producer requires from the broker for each message |
maxTotalConnections | NO | Maximum number of total connections for the connection pool |
maxIdleConnections | NO | Maximum number of idle connections for the connection pool |
customProducerFactoryImpl | YES | Uncomment if you need another producer factory implementation to connect to a bus other than Kafka |
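A possible filled-in version of this section (broker addresses and the producer factory class are placeholders):

```properties
# Placeholder broker addresses: replace with your Kafka cluster
brokersList=broker1:9092,broker2:9092
kafkaSerializationClass=kafka.serializer.StringEncoder
# 1 = wait for the leader's acknowledgement
kafkaRequiredAcks=1
maxTotalConnections=20
maxIdleConnections=5
# Uncomment to target a bus other than Kafka (hypothetical class name)
#customProducerFactoryImpl=com.example.sda.MyProducerFactory
```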