
Syncfusion with Hortonworks and HDInsight Clusters

Hello,

We are currently running a POC with Hortonworks HDP on-premises and Microsoft Azure HDInsight in the cloud.
We have created clusters on both; the Hortonworks HDP and HDInsight clusters are on different networks.
I am trying to run Syncfusion from a Windows machine and connect to the existing Hortonworks and HDInsight clusters, but I am having difficulty with it.
Is it possible to run my Pig and Hive scripts on top of an HDInsight or HDP cluster? If so, how?
Kindly let me know whether this is feasible and where to find the documentation.

Many Thanks,
Pradip

6 Replies

PP Praveena P Syncfusion Team November 18, 2015 11:32 AM UTC

Hi Pradip,

At present, the Syncfusion distribution of the Big Data Platform does not support HDInsight or Hortonworks HDP clusters.

Syncfusion Big Data Studio was designed with the Syncfusion Hadoop distribution for Windows in mind. We created agent services (the remote agent and the installer agent) as the medium for communication between the Hadoop services and Big Data Studio (starting and stopping services, interactive job submission, and so on).
Azure HDInsight uses the Hortonworks Data Platform distribution and has its own HDInsight SDK for interacting with it. Hence, it is not possible to run Pig or Hive scripts on top of an HDInsight or HDP cluster using the Syncfusion Big Data Platform.

However, we do support deploying a Syncfusion Hadoop cluster in an Azure VM environment. After creating Hadoop clusters in Azure using the Syncfusion Big Data Platform, you can interact with them and submit your Pig and Hive scripts using Syncfusion Big Data Studio, either by installing the Studio build on one of the Azure VMs within the same virtual network or by establishing a point-to-site VPN connection to the Azure virtual network.

To create a Hadoop cluster on Microsoft Azure using the Syncfusion Big Data Platform, please refer to the following help documents:

For older version 2.1.0.77 or lower, refer here. https://help.syncfusion.com/bigdata/cluster-manager/cluster-creation#cluster-installation-on-microsoft-azure

For version 2.9.0.2 or higher (preview builds), refer here. https://www.syncfusion.com/downloads/support/forum/121177/ze/Azure_Cluster_Creation-2086244975

The following link gives the steps to enable a point-to-site connection so that Syncfusion Big Data Studio can access the Hadoop cluster on Azure VMs when it is installed on a separate network:
https://www.syncfusion.com/downloads/support/forum/121177/ze/BigDataStudioIntegration-788310818

To learn more about Big Data Studio and Cluster Manager, refer to the following links:
https://help.syncfusion.com/bigdata/bigdata-studio/overview
https://help.syncfusion.com/bigdata/cluster-manager/overview

Regards,
Praveena.


PV Pradip VS November 18, 2015 05:22 PM UTC

Thanks a lot, Praveena, for the detailed clarification. I did test it on an Azure VM and it worked fine, but I ran into issues when connecting to the HDInsight and HDP clusters.
Thank you once again.

Regards,
Pradip


PV Pradip VS November 18, 2015 06:26 PM UTC

Hi Praveena,

One more clarification. The page http://www.syncfusion.com/products/big-data states:
"100% Azure HDInsight compatible. You can seamlessly deploy to Microsoft Azure HDInsight. Azure HDInsight is a cloud implementation of Hadoop provided by our partner Microsoft."
Please find the attached screenshots. Kindly clarify.

Thanks,
Pradip

Attachment: SyncFusion_HDInsight_c0339404.zip


MK Madhan Kumar S Syncfusion Team November 19, 2015 02:39 PM UTC

Hi Pradip,
Sorry for the miscommunication. Please find the details below.

"100% Azure HDInsight compatible. You can seamlessly deploy to Microsoft Azure HDInsight. Azure HDInsight is a cloud implementation of Hadoop provided by our partner Microsoft."

The solution you develop with our product (Syncfusion Big Data Platform) is fully compatible with Azure HDInsight. That is, you can deploy to Azure HDInsight a solution based on scripts (Pig, Hive, Spark, HBase) developed in Big Data Studio, on C# MapReduce samples (which use a custom HDInsight SDK), or on C# samples for Hive, Spark, and HBase (which use the Syncfusion Thrift libraries).

This is the message that the text in the screenshot was meant to convey.


Regards,
Madhan Kumar S



PV Pradip VS November 19, 2015 07:36 PM UTC

Hi Madhan,

Thanks so much for the clarification. Could you kindly send me the documentation links on how I can integrate Syncfusion with HDInsight and run my scripts or deploy my solutions?
I'm unable to find documentation for this. Kindly help.

Many Thanks,
Pradip


RE Rengasamy Syncfusion Team November 25, 2015 11:55 AM UTC

Hi Pradip,
Thank you for your update.
At present, you cannot directly interact with or submit jobs to an HDInsight cluster using Syncfusion Big Data Studio. We therefore kindly request that you use Python, C#, Azure PowerShell, Zeppelin, or similar tools to deploy the scripts (solutions) that you developed in Syncfusion Big Data Studio with Pig, Hive, Spark, etc. to the HDInsight cluster (Hadoop or Spark).
Please refer to the following links for more information:
https://azure.microsoft.com/en-in/documentation/articles/hdinsight-submit-hadoop-jobs-programmatically/
https://azure.microsoft.com/en-in/documentation/articles/hdinsight-use-hive/
https://azure.microsoft.com/en-in/documentation/articles/hdinsight-use-pig/
https://azure.microsoft.com/en-us/documentation/articles/hdinsight-apache-spark-zeppelin-notebook-jupyter-spark-sql/#jupyter 
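As a rough illustration of the programmatic route described in the links above, the following Python sketch builds a Hive job submission for the WebHCat (Templeton) REST endpoint that HDInsight clusters expose. It is only a sketch under assumptions: the cluster name, user name, password, query, and status directory are hypothetical placeholders, and the request is constructed but not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_hive_request(cluster, user, password, query, status_dir):
    """Build (but do not send) a WebHCat request that submits a Hive query."""
    # HDInsight exposes WebHCat under /templeton/v1 on the cluster gateway.
    url = "https://{}.azurehdinsight.net/templeton/v1/hive".format(cluster)
    # Form fields: the user to run as, the HiveQL to execute, and the
    # directory on the cluster's default storage where job status is written.
    body = urllib.parse.urlencode({
        "user.name": user,
        "execute": query,
        "statusdir": status_dir,
    }).encode("utf-8")
    # The gateway uses HTTP basic authentication with the cluster login.
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", "Basic " + token)
    return request


# Hypothetical values, for illustration only.
req = build_hive_request(
    "mycluster", "admin", "secret", "SHOW TABLES;", "/example/status")
# To actually submit the job: urllib.request.urlopen(req)
```

On success, WebHCat replies with a JSON document containing a job id, which can then be polled for completion.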
Note: In this case, you have to modify the input and output paths in your script to match the blob storage account used when the HDInsight cluster was created.
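The path change in the note can be sketched in Python as follows. This is a minimal example, assuming the script refers to its data with quoted hdfs:// paths and that the target is an Azure blob container addressed with the wasb:// scheme; the container name, storage account name, and sample Pig statement are hypothetical.

```python
import re


def to_wasb_uri(path, container, account):
    """Map an HDFS-style path to a wasb:// URI on the given storage account."""
    # Drop any existing scheme and authority, e.g. "hdfs://namenode:9000".
    path_only = re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]*", "", path)
    return "wasb://{}@{}.blob.core.windows.net/{}".format(
        container, account, path_only.lstrip("/"))


def rewrite_script_paths(script, container, account):
    """Rewrite every quoted hdfs:// path in a Pig or Hive script."""
    pattern = re.compile(r"(['\"])(hdfs://[^'\"]+)\1")

    def replace(match):
        quote, old_path = match.group(1), match.group(2)
        return quote + to_wasb_uri(old_path, container, account) + quote

    return pattern.sub(replace, script)


# Hypothetical script line, container, and account, for illustration only.
pig_line = "A = LOAD 'hdfs://localhost:9000/data/in.csv' USING PigStorage(',');"
print(rewrite_script_paths(pig_line, "mycontainer", "myaccount"))
```

Paths that are built dynamically or passed as script parameters would not be caught by this pattern and need to be updated by hand.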
Please let us know if you have any questions about this.
Regards,
Rengasamy
