Big Data is no longer under active development.

The Syncfusion Big Data Platform

The Syncfusion Big Data Platform is the first and only complete Hadoop distribution designed for both Windows and Linux. Develop on Windows using familiar tools and deploy on both Windows and Linux.

In today's data-driven world, managing large quantities of data has become an important ingredient in business success. Managing big data comes with several challenges, including cost-effective storage and querying across structured and unstructured data. It also offers tremendous potential. Imagine not having to decide which of your data is valuable. Imagine being able to store any amount of data using commodity hardware with linear scalability. The Hadoop environment makes all of this, and much more, possible today.

And now, Syncfusion has made these powerful technologies available on both Windows and Linux. With the Syncfusion Big Data Platform, you have complete access to the Hadoop environment. By adopting our platform, you are using an industry-tested solution currently employed by companies such as Microsoft, Facebook, Amazon, Adobe, Hulu, LinkedIn, and Yahoo.

Enterprise-grade Hadoop cluster

The Syncfusion Big Data Platform includes a complete production environment that can run Hadoop jobs in a scalable manner on a full cluster. The included cross-platform Cluster Manager application makes provisioning easy, allowing you to manage and monitor multi-node Hadoop clusters on both Windows and Linux.

Easily create production Hadoop clusters on Windows and Linux

With the Syncfusion Hadoop distribution, you can create clusters in minutes using commodity machines running Windows (Windows 7/Windows Server 2008 and later) or Linux (Ubuntu, CentOS).

Manage and monitor multiple clusters

Our cluster management system has built-in support for managing and monitoring multiple clusters. All you need is a web browser.

Easily monitor job status

The included monitoring system provides information on the general health of your cluster. It also provides detailed metrics on nodes that are part of the cluster and jobs running on the cluster.

Hadoop cluster upgrade

You can perform a Hadoop rolling upgrade to move your cluster from an older version to a newer one without data loss or downtime in accessing HDFS. You can also upgrade the SDK packages that ship with Hadoop.

Integrated support for PySpark and Scientific Python

The platform cleanly integrates the scientific Python stack, including integrated Python package management. Commonly used packages are bundled, and an easy-to-use interface lets you update and add packages as needed. Visualize large amounts of data using Spark and the rich visualization features of the included IPython stack.

Caching - Spark SQL

Spark excels in processing in-memory data. It supports loading data into a cluster-wide in-memory cache to deliver the best performance in data visualization and retrieval. With Cluster Manager, you can cache tables in the Spark Thrift Server with a few clicks.
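
Cluster Manager handles caching through its interface. For reference, Spark SQL itself exposes the same idea through `CACHE TABLE` statements, which a SQL client connected to the Thrift Server could send. The following is a minimal Python sketch that only generates those statements. The helper name `cache_statements` and the table names are hypothetical, and the source does not say whether Cluster Manager issues exactly these statements.

```python
import re
from typing import Iterable, List

# Accept only plain "table" or "db.table" identifiers (an assumption made
# here to keep the sketch safe; real Spark SQL allows more forms).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def cache_statements(tables: Iterable[str], lazy: bool = False) -> List[str]:
    """Build Spark SQL CACHE TABLE statements for the given table names.

    Hypothetical helper for illustration; it does not contact a server.
    """
    # CACHE LAZY TABLE defers loading until the table is first used.
    keyword = "CACHE LAZY TABLE" if lazy else "CACHE TABLE"
    statements = []
    for table in tables:
        if not _IDENTIFIER.match(table):
            raise ValueError(f"unsupported table name: {table!r}")
        statements.append(f"{keyword} {table}")
    return statements


if __name__ == "__main__":
    # Placeholder table names.
    for statement in cache_statements(["sales.orders", "customers"]):
        print(statement)
```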

Prepare and manage Oozie jobs

Oozie is a workflow scheduler system for managing Hadoop jobs. The Syncfusion Big Data Platform makes job submission and monitoring easier by providing a user-friendly interface for submitting jobs and a dashboard for monitoring them.

Prepare and run Sqoop jobs

Sqoop helps you transfer data from relational databases into HDFS. The Syncfusion® Big Data Platform makes the job even easier by providing a user-friendly interface to Sqoop.
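
For readers who have not used Sqoop from the command line, the kind of import that the interface prepares can be sketched as follows. This minimal Python sketch only assembles the argument list for a `sqoop import` call and does not run anything. The helper name `build_sqoop_import`, the JDBC URL, the table name, and the target directory are illustrative assumptions, not values from the Syncfusion product.

```python
from typing import List


def build_sqoop_import(jdbc_url: str, table: str, target_dir: str,
                       username: str, num_mappers: int = 1) -> List[str]:
    """Assemble the argument list for a basic `sqoop import` run.

    Hypothetical helper for illustration; it builds the command only.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source database
        "--username", username,
        "--table", table,                   # relational table to copy
        "--target-dir", target_dir,         # destination directory in HDFS
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


if __name__ == "__main__":
    # All values below are placeholders.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:mysql://db.example.com/sales",
        table="orders",
        target_dir="/data/orders",
        username="etl_user",
        num_mappers=4,
    )
    print(" ".join(cmd))
```

A real run would also need a credential option such as `-P`, which prompts for the password on the console.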

Azure automation

You can easily create, deploy, and scale a secure Syncfusion Hadoop cluster with basic or Kerberos-enabled authentication in a Microsoft Azure Virtual Machines environment in minutes. The Syncfusion Big Data Cluster Manager allows you to effectively manage the resources in Microsoft Azure, with options to track billing details and to shut down, restart, and destroy the virtual machines as required. You can also start and stop the virtual machines with the Hadoop cluster at scheduled intervals. An Azure-based Hadoop cluster also supports Azure Blob storage as the default file system, so you can easily scale storage and switch between hot and cool access tiers.

Syncfusion Big Data Platform Sandbox

Syncfusion® has published a sandbox Azure image containing a pseudo-node Hadoop cluster. You can provision the image from the Azure portal and start exploring the cluster once the VM is up and running. You can also connect Big Data Studio to it directly, without a VPN, and submit jobs.

Create secure Hadoop cluster

You can easily create a Kerberos-enabled secure Hadoop cluster on both Windows and Linux within minutes. The Syncfusion Big Data Cluster Manager integrates seamlessly with Active Directory Server, and you can easily manage users' access to HDFS, Hive, and HBase.


  • Powerful HBase support

    The platform comes with integrated support for HBase. Store and process massive amounts of data in a horizontally scalable manner. Achieve high speed reads and writes. HBase is quickly becoming the NoSQL store of choice and the Syncfusion Big Data Platform makes it easier to deploy than ever before.

  • Absolutely no manual configuration

    When creating a cluster, manual configuration is not required. The cluster configuration and management system we provide takes care of everything.

  • No lengthy list of prerequisites

    All you need on each node is a small agent that depends on .NET Framework 4.5 or later on Windows and .NET Core (self-contained) on Linux. There are absolutely no other dependencies. The cluster manager will then coordinate the complete installation process with the agent.

  • Automate agent installation with Windows PowerShell/Active Directory

    While the agent can be manually installed for smaller clusters, we provide scripts that can automate the installation of agents to a large number of nodes in a completely unattended manner.

  • Run Hadoop jobs written in any language, including C#

    The Syncfusion Big Data platform can run jobs written in any language. Java, Pig, Hive, Python, and Scala are all supported. There is also complete support for C# or any other .NET platform language.

  • YARN-enabled cluster

    The Syncfusion Big Data platform comes ready for YARN applications. You can run any YARN-compatible application system. You are not limited to MapReduce.

  • Access Hive data directly from C#

    Data stored in the Hadoop distributed file system (HDFS) can be directly accessed from your C# applications through Hive. We ship an extensive set of C# samples that demonstrate access under different scenarios.

  • Hive/HBase data access–nothing more is needed

    No ODBC driver or other extensions are needed. Everything you need is included.

  • Work with Apache Spark

    Syncfusion Big Data Studio includes complete access to the Spark environment over YARN. Take advantage of new higher level paradigms for processing massive amounts of data faster than ever before.

  • Scale as you need

    You can start with a 5- or 10-node cluster and scale as you need. Hadoop has scaled to thousands of nodes and can grow with your business needs.

  • Commodity hardware

    When we say commodity hardware, we really mean it. You can buy any reasonably capable, desktop-grade hardware and put it to good use. As your needs grow, you can move to higher-grade hardware.

  • Run on Azure/AWS/other cloud virtual machines

    You can also run your own custom-configured Hadoop cluster on Azure or AWS virtual machines. Unlike Hadoop in cloud services, you have complete and direct control over your cluster.

  • Our promise: Up and running in 15 minutes or less

    We have fine-tuned the entire experience starting from the moment you download the product. We guarantee zero manual configuration whether you are configuring a small cluster or a much larger one.
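
The point above about running jobs written in any language can be illustrated with a word count in the style of Hadoop Streaming, where the mapper and reducer exchange tab-separated key/value lines. This is a minimal sketch that simulates the map, sort, and reduce phases locally in plain Python. The function names are illustrative and are not taken from the Syncfusion samples.

```python
from itertools import groupby
from typing import Iterable, Iterator, List


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit one 'word<TAB>1' record per word, lowercased."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reducer(sorted_records: Iterable[str]) -> Iterator[str]:
    """Sum the counts per word.

    The input must already be sorted by key, as it is after the
    shuffle-and-sort step of a MapReduce job.
    """
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines: Iterable[str]) -> List[str]:
    """Simulate map -> sort -> reduce on a single machine."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_local(["big data", "Big clusters"]):
        print(record)
```

On a cluster, the mapper and reducer would typically be separate scripts that read standard input and write standard output; the local runner here exists only to show the data flow.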

Syncfusion Big Data Studio

The Syncfusion Big Data Studio provides an easy-to-use environment to work with popular big data tools such as Spark, HBase, Pig and Hive. It also provides direct access to the Hadoop Distributed File System, HDFS. The Big Data Studio ships with a local install of the Syncfusion Big Data SDK, which provides a complete working Hadoop distribution right on your laptop. No virtual machines are needed, so there is no need to juggle between Linux and Windows. You don’t even have to be connected to a cluster to work on Hadoop jobs. You can work with Hadoop on your Windows machine, even when offline, and then deploy to a cluster for production when you are ready.

Interactive HDFS explorer

Syncfusion Big Data Studio includes a full-fledged explorer-UI that allows for easy interaction with files stored within HDFS.

Work interactively with Pig, Hive, Spark and HBase

Work interactively with Pig, Hive, HBase, and Spark (Scala, Python, IPython, and Spark SQL). Syncfusion® Big Data Studio provides an interactive command-line interface and a rich editor for working with Pig, Hive, HBase, and Spark. Use the power of a read-eval-print loop to get work done quickly.

Prepare and run Sqoop jobs

Sqoop helps you transfer data from relational databases into HDFS. The Syncfusion Big Data Platform makes the job even easier by providing a user-friendly interface to Sqoop.

Submit jobs in secure Hadoop cluster

You can easily add a Kerberos-enabled secure Hadoop cluster and submit Hadoop, Sqoop, Pig, Hive, Spark, and HBase jobs from within Big Data Studio.

Experimental connect support

Support to connect with Azure Cluster

You can connect directly to a Syncfusion Hadoop cluster running in Azure without a VPN connection. Traffic is sent over SSL and the connection is authenticated.


  • Developers–we have you covered

    Syncfusion ships a unique, local, single-node distribution of Hadoop complete with an interactive development environment. You can install your own local version of Hadoop with no dependencies other than the .NET framework.

  • Skip the learning curve

    Hadoop development on Windows today involves dealing with Linux virtual machines, and using command-line tools to get most of the work done. With Syncfusion, you get a native Windows experience with absolutely no need to run virtual machines.

  • Local version is ideal for development

    Get all your scripts tested and working before running them on a cluster. Scripts are 100% compatible, not just with Syncfusion production clusters, but with any other Hadoop distribution.

  • Work offline as needed

    You can also work offline with the local cluster. You can now author big data applications wherever you may be.

  • Samples, samples, and more samples

    The Syncfusion Big Data Platform provides several complete samples written using Pig, Hive, Java, Scala, Python and C#. These samples are designed to help you start quickly.

  • 100% Azure HDInsight compatible

    You can seamlessly deploy to Microsoft Azure HDInsight. Azure HDInsight is a cloud implementation of Hadoop provided by our partner Microsoft.



Why Syncfusion

  • Community License available

    If you qualify for our community license, the Syncfusion Big Data Platform is available to you completely free.

  • Full commercial support by Syncfusion

    The commercial version of the Syncfusion Big Data Platform comes bundled with technical support. There are no limits to the number of support incidents you can create. You can interact directly with our big data team through our unique Direct-Trac support system.

  • Reasonably priced commercial edition

    You will find our pricing for commercial support to be a breath of fresh air. Contact us today to get started.

  • Syncfusion customers–you are covered already

    If you are a Syncfusion customer with an enterprise-scope license, you are likely already covered with substantial support benefits. Simply log into Direct-Trac or contact Syncfusion for additional details.

  • End-to-end consulting services available

    At Syncfusion, we have deep expertise on the Hadoop stack. With our extensive experience building the Syncfusion Big Data Platform, we can build end-to-end big data solutions for you better and faster than anyone else. Contact us today.

Frequently Asked Questions

Licensing

Can the entire product, including the production cluster, be used commercially?

Yes, if you qualify for a community license or if you obtain a commercial license.

Support

What are the support options that are available?

Forum support is available to everyone for free. Commercial support under a defined SLA is available for an annual fee. Details of the fees are given in the following FAQ.

Are there limits to the number of incidents that can be submitted?

No. As a matter of policy, Syncfusion® does not normally limit the number of support incidents for those under commercial support.

How can I submit feature requests?

You can log feature requests through our Direct-Trac customer-service portal.

Where do I report bugs?

You can report bugs using our Direct-Trac customer-service portal.

Does Syncfusion offer paid consulting services in the big data domain?

Yes. Please contact us for additional information.

Benefits for Syncfusion customers

Are there special benefits for Syncfusion customers?

Yes. If you are a Syncfusion® Plus member, you, as a named user, will automatically receive commercial support for one cluster of up to 5 nodes (the cluster limit is per organization). In addition, current Syncfusion® Global Enterprise License holders receive Platinum-level commercial support at no charge.

I have a Syncfusion Essential Studio Community License. Do I receive access to the Big Data Platform?

Yes. You will receive commercial support on a single cluster for up to 5 nodes.

Software Requirements

What are the requirements to run the Syncfusion Big Data Platform?

Windows 7 or later with .NET Framework 4.5, or Linux (Ubuntu, CentOS). For a production cluster, we recommend deploying on Windows Server 2008 or later, or on Linux (Ubuntu, CentOS).

Do I need to install Cygwin, Python, Java, etc?

No. We automatically handle all dependencies. Cygwin is not used by the Syncfusion® Big Data Platform.

Hardware Requirements

Do you recommend specific types of hardware?

The Syncfusion® Big Data Platform runs on a variety of hardware. We recommend you start with new hardware with 16 GB of RAM or more, and one or more disks with sufficient capacity for your needs. RAID is not required for data nodes. RAID is a good idea for name nodes. Specification-wise, you can certainly go as high as you want once your needs expand. You can contact us for specific advice based on your requirements.

As an example, we have a system for gathering and summarizing product metrics running on a small cluster with off-the-shelf, desktop-quality hardware. It has delivered tremendous value over the past few months. We also run other clusters that are much larger. Don't feel the need to scale up when you get started. Start with the basics. One of the nice things is that it is very simple to scale up as you need.
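
The remark that RAID is not required for data nodes follows from HDFS keeping multiple copies of each block on different nodes. The same replication also reduces usable capacity, which matters when sizing disks. A rough sizing sketch, assuming the stock HDFS replication factor of 3 and a hypothetical 25% of disk held back for non-HDFS use (both are assumptions to adjust for your cluster, not Syncfusion guidance):

```python
def usable_hdfs_capacity_tb(nodes: int, disk_tb_per_node: float,
                            replication: int = 3,
                            reserved_fraction: float = 0.25) -> float:
    """Estimate how much user data a cluster can hold, in terabytes.

    Hypothetical helper: raw disk, minus a reserve, divided by the number
    of copies HDFS keeps of each block.
    """
    if nodes < 1 or replication < 1:
        raise ValueError("nodes and replication must be at least 1")
    if not 0 <= reserved_fraction < 1:
        raise ValueError("reserved_fraction must be in [0, 1)")
    raw_tb = nodes * disk_tb_per_node
    return raw_tb * (1 - reserved_fraction) / replication


if __name__ == "__main__":
    # Five data nodes with 4 TB each: 20 TB raw, 15 TB after the reserve,
    # 5 TB of user data at three copies per block.
    print(usable_hdfs_capacity_tb(nodes=5, disk_tb_per_node=4.0))
```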

Technical

Can the Syncfusion Big Data Platform be used to configure clusters using virtual machines on cloud servers?

Yes. We test this use-case using Microsoft Azure and other cloud providers.

Is there support for YARN?

Yes. The core version of Apache Hadoop we ship is 2.5.2 or higher, and it comes with complete support for YARN.

Is there complete support for Apache Pig?

Yes.

Is there complete support for Apache Hive?

Yes.

Is HBase supported?

Yes.

Can I connect to data stored on HDFS from my C# applications?

Yes. We ship several samples with the Syncfusion® Big Data Studio installation.

Can I use C# to author MapReduce applications?

Yes. We ship several samples with the Syncfusion® Big Data Studio install. You can write code or use third-party assemblies as needed.

Can I use Java to author MapReduce applications?

Absolutely. We ship several samples with the Syncfusion® Big Data Studio install.

Can I use Python or other languages?

Absolutely. We ship several Python samples with the product.

Do I have to make any additional changes to enable high availability on the cluster?

No. Support for high availability is built-in and works out of the box with Syncfusion® Big Data Platform.

Is Syncfusion Big Data Platform compatible with Microsoft HDInsight?

Yes. Our platform is compatible with the Microsoft HDInsight platform. Identical code will run on both platforms. Please note, however, that our interactive studio environment does not support direct connections to the HDInsight platform.

Is there a limit on cluster size?

Not that we know of. From a practical perspective, we test clusters of up to 100 nodes. Custom tuning will almost certainly be required for very large clusters. Contact us for assistance.

Do I need to have DNS configured properly to install Syncfusion Big Data Platform? Can I use IP addresses instead?

You need to have DNS and reverse DNS configured properly to install Syncfusion® Big Data Platform. Configuration with IP addresses is not recommended and is not supported by the cluster manager. If you provide IP addresses, we will translate these into host names, assuming DNS is configured as expected.
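
Before installing, it can help to confirm that each node name resolves to an address and that the address resolves back to the same name. The following is a minimal sketch using Python's standard `socket` module. The function name `dns_round_trips` is hypothetical, the resolver parameters exist only so the check can be exercised without a network, and this is not the check that the cluster manager itself performs.

```python
import socket
from typing import Callable


def dns_round_trips(hostname: str,
                    forward: Callable[[str], str] = socket.gethostbyname,
                    reverse: Callable[[str], tuple] = socket.gethostbyaddr) -> bool:
    """Return True if hostname resolves to an IP that resolves back to it.

    Hypothetical helper for illustration. The comparison is an exact,
    case-insensitive match against the reverse name and its aliases.
    """
    try:
        ip_address = forward(hostname)
        resolved_name, aliases, _ = reverse(ip_address)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False
    candidates = {resolved_name.lower()}
    candidates.update(alias.lower() for alias in aliases)
    return hostname.lower() in candidates
```

Calling `dns_round_trips("node1.example.com")` with a real node name in place of the placeholder uses the system resolver. A `False` result suggests that either the forward or the reverse record is missing or inconsistent.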

Other questions

How does Syncfusion Big Data Platform compare with distributions from other vendors?

Most current distributions focus on the Linux platform alone and are not easy to set up and maintain on Windows. We aim to provide a solid big data platform tailored for both Windows and Linux. Additionally, the platform offers many extra features, such as a powerful cross-platform cluster manager, that minimize the effort required to get started.

About Syncfusion

Who is Syncfusion?

We have been in business since 2001. We are one of the largest providers of software frameworks in the world. We provide frameworks to approximately half of the Fortune 500 companies, and we globally have over half a million users. Some of the best-known software packages in the world are built with our technology under the hood.