This blog post describes how to configure a WSO2 DAS fully distributed (cluster) setup.
WSO2 Data Analytics Server 3.0.0 combines real-time, batch, interactive, and predictive (via machine learning) analysis of data into one integrated platform to support the multiple demands of Internet of Things (IoT) solutions, as well as mobile and Web apps. For more information, refer to the WSO2 DAS documentation.
The following diagram describes the fully distributed deployment pattern. This pattern is used as a high-availability deployment.
Prerequisite:
- Download and extract WSO2 DAS 3.0.0 (one pack per node, 7 nodes)
- An Apache HBase and Apache HDFS cluster
- A MySQL setup
- An SVN server to use for deployment synchronization
If you do not want to use Apache HBase and Apache HDFS as the data store, you can use an RDBMS instead. In this blog post I'm only focusing on Apache HBase and Apache HDFS as the DAS data store.
Please note that each node runs its own DAS pack (7 packs on 7 machines), so no port offset is needed. You need to offset each node only if you set up a DAS cluster on a single machine, or want to host multiple nodes on a single machine. To avoid port conflicts, change the following property in carbon.xml:

<DAS_HOME>/repository/conf/carbon.xml

<Offset>0</Offset>
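Carbon adds the offset value to every default listener port. As a rough sketch of what that means (9443 is the Carbon default management HTTPS port and 7611 the default Thrift data receiver port; the offsets themselves are illustrative):

```shell
# Each node on a shared machine gets a different <Offset>;
# every listener port shifts up by that amount.
BASE_HTTPS=9443    # default management console HTTPS port
BASE_THRIFT=7611   # default Thrift data receiver port
for OFFSET in 0 1 2; do
  echo "offset=$OFFSET https=$((BASE_HTTPS + OFFSET)) thrift=$((BASE_THRIFT + OFFSET))"
done
```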
Database configuration
We are using MySQL for all Carbon-related databases, the analytics processed record store, and the metrics database.
1. Create all necessary databases.
create database dasreceiver1;   -- DAS receiver node 1 local database
create database dasreceiver2;   -- DAS receiver node 2 local database
create database dasanalyzer1;   -- DAS analyzer node 1 local database
create database dasanalyzer2;   -- DAS analyzer node 2 local database
create database dasindexer1;    -- DAS indexer node 1 local database
create database dasindexer2;    -- DAS indexer node 2 local database
create database dasdashboard;   -- DAS dashboard node local database
create database regdb;          -- Registry DB that is mounted to all DAS nodes
create database userdb;         -- User DB, shared with all DAS nodes
create database metrics;        -- Metrics DB, used by WSO2 Carbon Metrics
create database analytics_processed_data_store;  -- Stores analytics processed records
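The statements above can be collected into one script and applied in a single pass; a sketch (the MySQL root user and the /tmp path are assumptions):

```shell
# Write all CREATE DATABASE statements to one script, then apply it with:
#   mysql -u root -p < /tmp/create-das-dbs.sql
cat > /tmp/create-das-dbs.sql <<'EOF'
create database dasreceiver1;
create database dasreceiver2;
create database dasanalyzer1;
create database dasanalyzer2;
create database dasindexer1;
create database dasindexer2;
create database dasdashboard;
create database regdb;
create database userdb;
create database metrics;
create database analytics_processed_data_store;
EOF
grep -c 'create database' /tmp/create-das-dbs.sql   # one statement per database
```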
2. On each node (all 7 nodes), add the following database configurations.
Open master-datasources.xml:

<DAS_HOME>/repository/conf/datasources/master-datasources.xml
Please note that when configuring WSO2_CARBON_DB, use the database relevant to each node (e.g., the receiver1 node uses the dasreceiver1 database as WSO2_CARBON_DB).
<datasources-configuration xmlns:svns="http://org.wso2.securevault/configuration">
    <providers>
        <provider>org.wso2.carbon.ndatasource.rdbms.RDBMSDataSourceReader</provider>
    </providers>
    <datasources>
        <datasource>
            <name>WSO2_CARBON_DB</name>
            <description>The datasource used for registry and user manager</description>
            <jndiConfig>
                <name>jdbc/WSO2CarbonDB</name>
            </jndiConfig>
            <definition type="RDBMS">
                <configuration>
                    <url>jdbc:mysql://10.100.7.53:3306/dasreceiver1</url>
                    <username>root</username>
                    <password>root</password>
                    <driverClassName>com.mysql.jdbc.Driver</driverClassName>
                    <maxActive>50</maxActive>
                    <maxWait>60000</maxWait>
                    <testOnBorrow>true</testOnBorrow>
                    <validationQuery>SELECT 1</validationQuery>
                    <validationInterval>30000</validationInterval>
                    <defaultAutoCommit>false</defaultAutoCommit>
                </configuration>
            </definition>
        </datasource>
        <datasource>
            <name>WSO2_DAS_UM</name>
            <description>The datasource used for registry and user manager</description>
            <jndiConfig>
                <name>jdbc/WSO2_DAS_UM</name>
            </jndiConfig>
            <definition type="RDBMS">
                <configuration>
                    <url>jdbc:mysql://10.100.7.53:3306/userdb</url>
                    <username>root</username>
                    <password>root</password>
                    <driverClassName>com.mysql.jdbc.Driver</driverClassName>
                    <maxActive>50</maxActive>
                    <maxWait>60000</maxWait>
                    <testOnBorrow>true</testOnBorrow>
                    <validationQuery>SELECT 1</validationQuery>
                    <validationInterval>30000</validationInterval>
                    <defaultAutoCommit>false</defaultAutoCommit>
                </configuration>
            </definition>
        </datasource>
        <datasource>
            <name>WSO2_DAS_REG</name>
            <description>The datasource used for registry and user manager</description>
            <jndiConfig>
                <name>jdbc/WSO2_DAS_REG</name>
            </jndiConfig>
            <definition type="RDBMS">
                <configuration>
                    <url>jdbc:mysql://10.100.7.53:3306/regdb</url>
                    <username>root</username>
                    <password>root</password>
                    <driverClassName>com.mysql.jdbc.Driver</driverClassName>
                    <maxActive>50</maxActive>
                    <maxWait>60000</maxWait>
                    <testOnBorrow>true</testOnBorrow>
                    <validationQuery>SELECT 1</validationQuery>
                    <validationInterval>30000</validationInterval>
                    <defaultAutoCommit>false</defaultAutoCommit>
                </configuration>
            </definition>
        </datasource>
    </datasources>
</datasources-configuration>
- Open metrics-datasources.xml and add the configuration below.

<DAS_HOME>/repository/conf/datasources/metrics-datasources.xml
<datasources-configuration xmlns:svns="http://org.wso2.securevault/configuration">
    <providers>
        <provider>org.wso2.carbon.ndatasource.rdbms.RDBMSDataSourceReader</provider>
    </providers>
    <datasources>
        <!-- MySQL -->
        <datasource>
            <name>WSO2_METRICS_DB</name>
            <jndiConfig>
                <name>jdbc/WSO2MetricsDB</name>
            </jndiConfig>
            <definition type="RDBMS">
                <configuration>
                    <url>jdbc:mysql://10.100.7.53:3306/metrics</url>
                    <username>root</username>
                    <password>root</password>
                    <driverClassName>com.mysql.jdbc.Driver</driverClassName>
                    <maxActive>60</maxActive>
                    <maxWait>60000</maxWait>
                    <testOnBorrow>true</testOnBorrow>
                    <validationQuery>SELECT 1</validationQuery>
                    <validationInterval>30000</validationInterval>
                    <defaultAutoCommit>false</defaultAutoCommit>
                </configuration>
            </definition>
        </datasource>
    </datasources>
</datasources-configuration>
- Open analytics-datasources.xml and add the configurations below.

<DAS_HOME>/repository/conf/datasources/analytics-datasources.xml
<providers>
    <provider>org.wso2.carbon.ndatasource.rdbms.RDBMSDataSourceReader</provider>
    <provider>org.wso2.carbon.datasource.reader.hadoop.HDFSDataSourceReader</provider>
    <provider>org.wso2.carbon.datasource.reader.hadoop.HBaseDataSourceReader</provider>
    <!--<provider>org.wso2.carbon.datasource.reader.cassandra.CassandraDataSourceReader</provider>-->
</providers>

<datasource>
    <name>WSO2_ANALYTICS_RS_DB_HBASE</name>
    <description>The datasource used for the analytics record store</description>
    <jndiConfig>
        <name>jdbc/WSO2HBaseDB</name>
    </jndiConfig>
    <definition type="HBASE">
        <configuration>
            <property>
                <name>hbase.master</name>
                <value>das300-hdfs-master:60000</value>
            </property>
            <property>
                <name>hbase.zookeeper.quorum</name>
                <value>das300-hdfs-master,das300-hdfs-slave1,das300-hdfs-slave2</value>
            </property>
            <property>
                <name>fs.hdfs.impl</name>
                <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
            </property>
            <property>
                <name>fs.file.impl</name>
                <value>org.apache.hadoop.fs.LocalFileSystem</value>
            </property>
        </configuration>
    </definition>
</datasource>

<datasource>
    <name>WSO2_ANALYTICS_FS_DB_HDFS</name>
    <description>The datasource used for the analytics file system</description>
    <jndiConfig>
        <name>jdbc/WSO2HDFSDB</name>
    </jndiConfig>
    <definition type="HDFS">
        <configuration>
            <property>
                <name>fs.default.name</name>
                <value>hdfs://das300-hdfs-master:9000</value>
            </property>
            <property>
                <name>dfs.data.dir</name>
                <value>/dfs/data</value>
            </property>
            <property>
                <name>fs.hdfs.impl</name>
                <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
            </property>
            <property>
                <name>fs.file.impl</name>
                <value>org.apache.hadoop.fs.LocalFileSystem</value>
            </property>
        </configuration>
    </definition>
</datasource>

<datasource>
    <name>WSO2_ANALYTICS_PROCESSED_DATA_STORE_DB</name>
    <description>The datasource used for the analytics processed record store</description>
    <definition type="RDBMS">
        <configuration>
            <url>jdbc:mysql://10.100.7.53:3306/analytics_processed_data_store</url>
            <username>root</username>
            <password>root</password>
            <driverClassName>com.mysql.jdbc.Driver</driverClassName>
            <maxActive>50</maxActive>
            <maxWait>60000</maxWait>
            <testOnBorrow>true</testOnBorrow>
            <validationQuery>SELECT 1</validationQuery>
            <validationInterval>30000</validationInterval>
            <defaultAutoCommit>false</defaultAutoCommit>
        </configuration>
    </definition>
</datasource>
- On each node, map the HBase/HDFS host names and IPs in the hosts file.
192.168.48.167 das300-hdfs-master
192.168.48.168 das300-hdfs-slave1
192.168.48.172 das300-hdfs-slave2
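For example, the mappings can be appended to each node's hosts file; the sketch below writes to a temporary copy for safety (on a real node you would append to /etc/hosts with sudo):

```shell
# Append the HBase/HDFS name-to-IP mappings; /tmp/hosts stands in
# for /etc/hosts here so the sketch can run without root.
HOSTS_FILE=/tmp/hosts
cat >> "$HOSTS_FILE" <<'EOF'
192.168.48.167 das300-hdfs-master
192.168.48.168 das300-hdfs-slave1
192.168.48.172 das300-hdfs-slave2
EOF
grep das300-hdfs "$HOSTS_FILE"   # confirm all three entries are present
```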
1. Open the carbon.xml file and make the following changes.
<DAS_HOME>/repository/conf/carbon.xml
- Add the necessary host names:
<HostName>das.qa.wso2.receiver1</HostName>
<MgtHostName>mgt.das.qa.wso2.receiver</MgtHostName>
- The changes below must be made in order to set up deployment synchronization. Note that the AutoCommit option is set to true in only one receiver node.
<DeploymentSynchronizer>
    <Enabled>true</Enabled>
    <AutoCommit>false</AutoCommit>
    <AutoCheckout>true</AutoCheckout>
    <RepositoryType>svn</RepositoryType>
    <SvnUrl>http://xxx.xx.x/svn/das300rc_repo</SvnUrl>
    <SvnUser>xxx</SvnUser>
    <SvnPassword>xxx</SvnPassword>
    <SvnUrlAppendTenantId>true</SvnUrlAppendTenantId>
</DeploymentSynchronizer>
<DAS_HOME>/repository/conf/axis2/axis2.xml
- Enable Hazelcast clustering:
<clustering class="org.wso2.carbon.core.clustering.hazelcast.HazelcastClusteringAgent" enable="true">
- Change the membershipScheme to wka:
<parameter name="membershipScheme">wka</parameter>
- Change the domain; all cluster nodes will join the same domain.
<parameter name="domain">wso2.qa.das.domain</parameter>
- Set the local member host and port to the IP and port of each node:
<parameter name="localMemberHost">192.168.48.205</parameter>
<parameter name="localMemberPort">4000</parameter>
- Add all the other well-known members with their ports (the IPs and ports of the other 6 nodes):
<members>
    <member>
        <hostName>192.168.48.21</hostName>
        <port>4000</port>
    </member>
    <member>
        <hostName>192.168.48.22</hostName>
        <port>4000</port>
    </member>
    <member>
        <hostName>192.168.48.23</hostName>
        <port>4000</port>
    </member>
    <member>
        <hostName>192.168.48.24</hostName>
        <port>4000</port>
    </member>
    <member>
        <hostName>192.168.48.25</hostName>
        <port>4000</port>
    </member>
</members>
<DAS_HOME>/repository/conf/registry.xml
<wso2registry>
    <currentDBConfig>wso2registry</currentDBConfig>
    <readOnly>false</readOnly>
    <enableCache>true</enableCache>
    <registryRoot>/</registryRoot>
    <dbConfig name="wso2registry">
        <dataSource>jdbc/WSO2CarbonDB</dataSource>
    </dbConfig>
    <dbConfig name="govregistry">
        <dataSource>jdbc/WSO2_DAS_REG</dataSource>
    </dbConfig>
    <remoteInstance url="https://localhost">
        <id>gov</id>
        <cacheId>root@jdbc:mysql://10.100.7.53:3306/regdb</cacheId>
        <dbConfig>govregistry</dbConfig>
        <readOnly>false</readOnly>
        <enableCache>true</enableCache>
        <registryRoot>/</registryRoot>
    </remoteInstance>
    <mount path="/_system/governance" overwrite="true">
        <instanceId>gov</instanceId>
        <targetPath>/_system/governance</targetPath>
    </mount>
    <mount path="/_system/config" overwrite="true">
        <instanceId>gov</instanceId>
        <targetPath>/_system/config</targetPath>
    </mount>
</wso2registry>
<DAS_HOME>/repository/conf/user-mgt.xml
<Property name="dataSource">jdbc/WSO2_DAS_UM</Property>
Here we are using 2 analyzer nodes. As this is a highly available setup, we need 2 Spark master nodes. For that, set the master count to 2 in the spark-defaults.conf file of both analyzer nodes.
<DAS_home>/repository/conf/analytics/spark/spark-defaults.conf
carbon.spark.master.count 2
Also set the Spark symbolic link path in the same file:

carbon.das.symbolic.link /home/das_symlink

Then create the symbolic link on each analyzer node (ln -s /path/to/target /path/to/symlink), e.g.:

sudo ln -s /home/analyzer1/analyzer /home/das_symlink
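The link can be sanity-checked before starting the analyzers; a sketch using illustrative /tmp paths in place of the real analyzer home:

```shell
# Create a symlink the same way as on the analyzer nodes, then verify it.
mkdir -p /tmp/analyzer1/analyzer            # stand-in for the DAS home
ln -sfn /tmp/analyzer1/analyzer /tmp/das_symlink
[ -L /tmp/das_symlink ] && readlink /tmp/das_symlink   # prints the target path
```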
Starting the cluster
When starting the instances, you can use predefined profiles to start them as receiver, analyzer, or indexer nodes.
./wso2server.sh -receiverNode start
./wso2server.sh -analyzerNode start
./wso2server.sh -indexerNode start
When starting the dashboard node, you need to add some startup parameters to the wso2server.sh file:

<DAS_home>/bin/wso2server.sh
$JAVACMD \
    -Xbootclasspath/a:"$CARBON_XBOOTCLASSPATH" \
    -Xms256m -Xmx1024m -XX:MaxPermSize=256m \
    -XX:+HeapDumpOnOutOfMemoryError \
    -XX:HeapDumpPath="$CARBON_HOME/repository/logs/heap-dump.hprof" \
    $JAVA_OPTS \
    -Dcom.sun.management.jmxremote \
    -classpath "$CARBON_CLASSPATH" \
    -Djava.endorsed.dirs="$JAVA_ENDORSED_DIRS" \
    -Djava.io.tmpdir="$CARBON_HOME/tmp" \
    -Dcatalina.base="$CARBON_HOME/lib/tomcat" \
    -Dwso2.server.standalone=true \
    -Dcarbon.registry.root=/ \
    -Djava.command="$JAVACMD" \
    -Dcarbon.home="$CARBON_HOME" \
    -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager \
    -Dcarbon.config.dir.path="$CARBON_HOME/repository/conf" \
    -Djava.util.logging.config.file="$CARBON_HOME/repository/conf/etc/logging-bridge.properties" \
    -Dcomponents.repo="$CARBON_HOME/repository/components/plugins" \
    -Dconf.location="$CARBON_HOME/repository/conf" \
    -Dcom.atomikos.icatch.file="$CARBON_HOME/lib/transactions.properties" \
    -Dcom.atomikos.icatch.hide_init_file_path=true \
    -Dorg.apache.jasper.compiler.Parser.STRICT_QUOTE_ESCAPING=false \
    -Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true \
    -Dcom.sun.jndi.ldap.connect.pool.authentication=simple \
    -Dcom.sun.jndi.ldap.connect.pool.timeout=3000 \
    -Dorg.terracotta.quartz.skipUpdateCheck=true \
    -Djava.security.egd=file:/dev/./urandom \
    -Dfile.encoding=UTF8 \
    -Djava.net.preferIPv4Stack=true \
    -Dcom.ibm.cacheLocalHost=true \
    -DdisableAnalyticsStats=true \
    -DdisableEventSink=true \
    -DdisableIndexThrottling=true \
    -DenableAnalyticsStats=true \
    -DdisableAnalyticsEngine=true \
    -DdisableAnalyticsExecution=true \
    -DdisableIndexing=true \
    -DdisableDataPurging=true \
    -DdisableAnalyticsSparkCtx=true \
    org.wso2.carbon.bootstrap.Bootstrap $*
    status=$?
done
On the first startup you can also pass the -Dsetup option to create the required database tables, e.g.:

./wso2server.sh -receiverNode -Dsetup start
Start the servers in the following sequence:
1. Receiver nodes
2. Analyzer nodes
3. Indexer nodes
4. Dashboard node
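The start order above can be scripted; a sketch where the host names are illustrative placeholders and each command would normally run over ssh (echoed here rather than executed):

```shell
# Start each group strictly in order: receivers, analyzers, indexers, dashboard.
start_group() {
  profile=$1; shift
  for host in "$@"; do
    echo "ssh $host '<DAS_HOME>/bin/wso2server.sh $profile start'"
  done
}
start_group -receiverNode recv1 recv2
start_group -analyzerNode an1 an2
start_group -indexerNode idx1 idx2
start_group "" dash1   # dashboard starts with the extra JVM flags instead of a profile
```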
If the cluster is set up successfully, "Member joined" messages can be seen in the carbon log:
TID: [-1] [] [2015-10-21 12:07:50,815]  INFO {org.wso2.carbon.core.clustering.hazelcast.wka.WKABasedMembershipScheme} -  Member joined [6974cb1c-8403-4711-9408-9de0cfaadda2]: /192.168.48.25:4000
TID: [-1] [] [2015-10-21 12:08:05,043]  INFO {org.wso2.carbon.core.clustering.hazelcast.wka.WKABasedMembershipScheme} -  Member joined [3ebbg27b-91db-4d98-8c8a-95e2604e3a9c]: /192.168.48.25:4000
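A quick command-line check for this is to count "Member joined" events in the log. The sketch below writes one sample line to a temporary file; on a live node, LOG would point at <DAS_HOME>/repository/logs/wso2carbon.log instead:

```shell
# Count "Member joined" events; each joining member logs one line.
LOG=/tmp/wso2carbon.log
cat > "$LOG" <<'EOF'
TID: [-1] [] [2015-10-21 12:07:50,815]  INFO {org.wso2.carbon.core.clustering.hazelcast.wka.WKABasedMembershipScheme} -  Member joined [6974cb1c-8403-4711-9408-9de0cfaadda2]: /192.168.48.25:4000
EOF
grep -c "Member joined" "$LOG"
```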