
Syncfusion Dashboard: Hive Datasource fetch very slow

Thread ID: 131329 | Created: Jun 28, 2017 01:08 PM | Updated: Jul 5, 2017 08:22 AM | Platform: Dashboard Platform | Replies: 3
Tags: Dashboard Datasources
Ilya Bo
Asked On June 29, 2017 08:44 AM

Hi,

I just set up the Syncfusion Dashboard platform and imported some test data into Hadoop (Avro files) via the Integration platform. Then I loaded the data from Hadoop into Hive tables (converting Avro to a table); in total I have only 400 rows. I then used this Hive instance as the data source for a Grid dashboard.

When I try to fetch data, it is very slow. My computer has 64 GB of RAM and SSDs, but fetching 400 rows takes 45 seconds. Can somebody point me to how to figure out the cause? All Syncfusion settings are at their defaults.

Thanks!

Nandhini K [Syncfusion]
Replied On July 3, 2017 08:31 AM

Hi Bochkov,
Thanks for using Syncfusion products.
Please find our response to your query below.
Query: "When I try to fetch data it works very slowly. My computer has 64 GB RAM and SSDs but fetching 400 rows of data takes 45 seconds. Can somebody point me how to figure out it?"
Reason for Slow Performance:
  • Because data processing through Hive Server2 runs as MapReduce jobs with multiple disk read/write operations, it takes considerable time for both small and large data sets.
Below is a sample of the query that the dashboard generates to fetch data through Hive Server2 and bind it to the selected widget. This sample adds the “contactname” column from the table “customers2” to the selected widget:
SELECT Sub_Table.Grid_Column_0 AS Grid_Column_0  FROM (SELECT customers2.contactname AS Grid_Column_0 ,ROW_NUMBER( ) OVER(ORDER BY customers2.contactname ASC) AS RowIndexColumn FROM default.customers2 AS customers2 GROUP BY customers2.contactname) Sub_Table WHERE RowIndexColumn BETWEEN 1 AND 200; 
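The shape of the sample query above (a grouped inner SELECT numbered with ROW_NUMBER, then filtered to a row range) can be illustrated with a small Python sketch. This is only an illustration of the pattern, not the dashboard's actual query generator: the function name `build_paged_query` and its parameters are hypothetical, and the row range 1 to 200 is taken from the sample.

```python
def build_paged_query(database, table, column, first_row=1, last_row=200):
    """Return a paged SELECT shaped like the sample query quoted above.

    Hypothetical helper for illustration only; it is not Syncfusion code.
    """
    # Only plain identifiers are accepted, since they are pasted into the SQL text.
    for name in (database, table, column):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    if first_row < 1 or last_row < first_row:
        raise ValueError("row range must satisfy 1 <= first_row <= last_row")

    qualified = f"{table}.{column}"
    # Inner query: one row per distinct value, numbered in sorted order.
    inner = (
        f"SELECT {qualified} AS Grid_Column_0, "
        f"ROW_NUMBER() OVER (ORDER BY {qualified} ASC) AS RowIndexColumn "
        f"FROM {database}.{table} AS {table} "
        f"GROUP BY {qualified}"
    )
    # Outer query: keep only the requested page of row numbers.
    return (
        "SELECT Sub_Table.Grid_Column_0 AS Grid_Column_0 "
        f"FROM ({inner}) Sub_Table "
        f"WHERE RowIndexColumn BETWEEN {first_row} AND {last_row}"
    )


query = build_paged_query("default", "customers2", "contactname")
print(query)
```

Note that the GROUP BY and the window ORDER BY both apply to the whole table before the row-range filter, so even a 200-row page requires a full grouping and sorting pass on the engine that runs the query.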
  • Please find below the timings for the count query and the query above in both Hive Server2 and Spark SQL.
Query | Hive Server2 | Spark SQL
Select query with GROUP BY and ORDER BY elements on the table created using the Avro file | 66 seconds | 5 seconds
Count query | 30 seconds | 0.2 seconds
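To put the timings above in proportion, the following short Python sketch computes the ratio between the two engines. The numbers are the ones reported in this reply (not measurements from the asker's machine); the dictionary layout and the `speedup` helper are only illustrative.

```python
# Timings in seconds as reported above: (Hive Server2, Spark SQL).
timings = {
    "select with GROUP BY / ORDER BY": (66.0, 5.0),
    "count": (30.0, 0.2),
}


def speedup(hive_seconds, spark_seconds):
    """Return how many times faster the Spark SQL run was."""
    return hive_seconds / spark_seconds


for name, (hive_s, spark_s) in timings.items():
    print(f"{name}: Spark SQL is about {speedup(hive_s, spark_s):.1f}x faster")
```

On these figures the grouped select is roughly 13 times faster and the count roughly 150 times faster on Spark SQL.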
Recommended Solution: 
Because Hive Server2 (MapReduce) is best suited to batch processing of large data sets, we recommend using the Spark SQL data source for near-real-time analytics such as dashboard visualization. Spark SQL processes data in memory, which avoids multiple disk I/O operations.
  • Tables created under Hive can also be accessed from “Spark SQL” in the Syncfusion distribution, as both use the same metastore database.
  • So you can use the “Spark SQL” connection type in the Syncfusion Dashboard platform instead of “Hive”.
Nandhini K.

Ilya Bo
Replied On July 4, 2017 05:18 AM

Nandhini, thanks for your answer!

One more thing I would like to ask:

When I try to create Spark SQL data source I don't see my tables.

I created tables in several ways:

  1. I created a test table using the attached sample (Scala).
  2. I also created one using AvroSerDe (HQL): 
But when I use the Hive data source, I can see them. What could be the problem?


Attachment: AvroFileSchema1089509997_df60505f.zip

Dhivyabharathi Govindaraj [Syncfusion]
Replied On July 5, 2017 08:22 AM

Hi Ilya, 
We have created a new support incident under your Direct-Trac account, since the reported query is being treated as an issue. Please follow the link below to access your account.

