The Syncfusion Data Integration Platform is an easy-to-use, powerful, and reliable system for processing and distributing data that helps automate the flow of data between systems. It provides tools to prepare (ETL or ELT) and blend data from a variety of data sources for generating analytics-ready data, which can then be fed into target applications for dashboards, data warehousing, and business intelligence.
The Syncfusion Data Integration Platform is useful in the following scenarios, which cannot be achieved with our Dashboard Platform alone:
- Complex ETL operations like joining two tables from two different databases.
- Fetch web data (using REST calls) from multiple sources and move it into a single target table. Then, connect the target table in a dashboard for visualization.
- Support for the following data sources:
  - JDBC connections (SQL Server, MySQL, Oracle, etc.)
  - REST API
  - Social feeds (Twitter, Google Analytics, etc.)
  - NoSQL (Cassandra, Mongo, HBase, Elasticsearch, etc.)
  - File formats (JSON, XML, CSV, etc.)
  - Big data (HDFS, HBase, Flume)
- We can also create our own processors using C# or Java.
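To make the first use case concrete, here is a minimal sketch of what joining two tables from two different databases involves. It is plain Python rather than anything from the Data Integration Platform: the two in-memory SQLite databases, the `orders` and `customers` tables, and the function names are all hypothetical stand-ins for two separate JDBC sources.

```python
import sqlite3


def build_sample_databases():
    """Create two separate in-memory databases (hypothetical sample data)."""
    orders_conn = sqlite3.connect(":memory:")
    orders_conn.execute(
        "CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)"
    )
    orders_conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10, 25.0), (2, 11, 40.5), (3, 99, 5.0)],
    )

    customers_conn = sqlite3.connect(":memory:")
    customers_conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    customers_conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(10, "Alice"), (11, "Bob")],
    )
    return orders_conn, customers_conn


def join_across_databases(orders_conn, customers_conn):
    """Inner-join orders with customers when the tables live in different databases."""
    # Load the smaller table into a lookup keyed by the join column.
    customers = dict(customers_conn.execute("SELECT id, name FROM customers"))

    joined = []
    for order_id, customer_id, amount in orders_conn.execute(
        "SELECT id, customer_id, amount FROM orders ORDER BY id"
    ):
        # Inner join: skip orders that have no matching customer.
        if customer_id in customers:
            joined.append(
                {
                    "order_id": order_id,
                    "customer": customers[customer_id],
                    "amount": amount,
                }
            )
    return joined


if __name__ == "__main__":
    orders_conn, customers_conn = build_sample_databases()
    for row in join_across_databases(orders_conn, customers_conn):
        print(row)
```

Because neither database can see the other's tables, the join has to happen outside both of them. That is the kind of work a data flow takes over before the blended result reaches a dashboard.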
The Syncfusion Dashboard Designer can establish a connection with the Syncfusion Data Integration Platform (DIP) Server and access its data flows. This makes it easy for the Dashboard Designer to consume data blended from different data connections: the DIP Server exposes the result as a data flow, and the Dashboard Designer reads it through the flow's target server or file.
Creating a data flow with the Data Integration Platform
To try this yourself, first download and install the Syncfusion Data Integration Platform by following the installation instructions.
Once the installation and suggested configuration are done, you will be directed to the home page of the data integration web application in your browser. Log in with the user account you created, or with the default one, and start creating the data flow with the required processors.
Here is a simple illustration of a data flow where data is fetched from a CSV file and written to a JSON file.
Figure 1: Data Integration Platform user interface for designing your workflow
In the Read input csv file processor, under its configuration settings, specify the path of the CSV input file. Likewise, in the Store output file processor, under its configuration settings, specify the path where the JSON output file is to be saved. To make the data at the target file path available to the Dashboard Designer, add the PublishDataSource processor as the endpoint of the flow.
Figure 2: Processor configuration settings
Execute the data flow to start fetching data from the source CSV file and writing it to the target JSON file. This process can also be scheduled to capture the latest data updates. Once the flow has run, the data is saved in JSON format at the specified path and is ready to be consumed from the dashboard.
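For readers who want to see the transformation itself, here is a rough Python sketch of what a CSV-to-JSON flow does conceptually. It is not how the platform's processors are implemented; the `csv_to_json` function and the sample file names are hypothetical, and the sketch assumes the CSV file has a header row.

```python
import csv
import json
import os
import tempfile


def csv_to_json(csv_path, json_path):
    """Read a CSV file with a header row and write its rows as a JSON array."""
    with open(csv_path, newline="", encoding="utf-8") as src:
        # Each row becomes a dict keyed by the column names in the header.
        rows = list(csv.DictReader(src))
    with open(json_path, "w", encoding="utf-8") as dst:
        json.dump(rows, dst, indent=2)
    return len(rows)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        csv_path = os.path.join(workdir, "input.csv")    # hypothetical input file
        json_path = os.path.join(workdir, "output.json")  # hypothetical output file
        with open(csv_path, "w", newline="", encoding="utf-8") as f:
            f.write("id,name\n1,Alice\n2,Bob\n")

        count = csv_to_json(csv_path, json_path)
        print(f"Wrote {count} rows to {json_path}")
```

Note that `csv.DictReader` keeps every value as a string, so a real flow would typically add a step that converts numeric and date columns before the data reaches the dashboard.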
Adding the data flow as a dashboard data source
Open the Syncfusion Dashboard Designer application. Expand the Server Explorer panel from the sidebar on the left. Expand Add Server and select DIP Server to add a new DIP Server connection. In the login window that appears, enter the URL where the data integration application containing the data flow is hosted, along with the user credentials.
Figure 3: DIP login in Dashboard Designer
After a successful login, the data flows accessible to that account are listed under their respective servers in the Server Explorer window. Here, the CsvToJson data flow we created is listed.
Figure 4: Viewing DIP data flow in Dashboard Designer
Right-click the CsvToJson data flow and select the Create Data Source option in the context menu. The output of the data flow, in this case the JSON file data, is then added as a new data source in the Dashboard Designer.
Figure 5: Adding the target data source in Dashboard Designer
Figure 6: JSON data source in Dashboard Designer table canvas
With this data source added, you can design your dashboard. See our documentation for help getting started.
Scheduling in the Data Integration Platform
The Syncfusion Data Integration Platform also provides scheduling options to keep your target data source up to date. You only need to do a couple of things:
- Go to the configuration settings for a processor in the data flow and set a schedule that suits you.
- Always keep the Data Integration Platform instance running on your machine.
Let’s say we want to schedule a daily data update at 4 A.M. for the data flow we created in this blog. Here’s what we have to do:
- Right-click the Read input csv file processor and select the Configure option.
Note: The initial processor in the data flow, the one that retrieves data from the source, is usually the preferred place to set the schedule.
- Open the Scheduling tab and select CRON driven from the Scheduling Strategy drop-down list.
- In the Run Schedule text box, enter the CRON expression 0 0 4 1/1 * ? * to schedule a daily data update at 4 A.M.
The schedule follows the time settings of the machine on which the DIP service is running.
Figure 7: Scheduling refresh every day at 4 A.M.
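If the CRON expression looks cryptic, the following Python sketch labels each of its fields and works out when a daily 4 A.M. schedule would next fire. It assumes the Quartz-style layout of seconds, minutes, hours, day of month, month, day of week, and an optional year. The function names are hypothetical, and this is not the platform's scheduler, only an illustration of how to read the expression.

```python
from datetime import datetime, timedelta

# Assumed Quartz-style field order; the seventh field (year) is optional.
QUARTZ_FIELDS = (
    "seconds",
    "minutes",
    "hours",
    "day_of_month",
    "month",
    "day_of_week",
    "year",
)


def label_cron_fields(expression):
    """Split a CRON expression into named fields without evaluating it."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    return dict(zip(QUARTZ_FIELDS, parts))


def next_daily_run(now, hour=4):
    """Return the next time a run scheduled daily at `hour`:00:00 would fire."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    for name, value in label_cron_fields("0 0 4 1/1 * ? *").items():
        print(f"{name:>12}: {value}")
    # datetime.now() uses the local clock of the machine running this script.
    print("Next run:", next_daily_run(datetime.now()))
```

Read this way, the expression means second 0 of minute 0 of hour 4, every day starting from day 1 of the month, in every month and every year, with the day of week left unspecified (`?`).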
To learn more about scheduling, see our documentation.