Syncfusion Dashboard: Hive Datasource fetch very slow

3 Replies
3 Participants

Created by
IB Ilya Bo

Platform
Dashboard Platform

Control
Dashboard Datasources

Created On
Jun 28, 2017 05:08 PM UTC

Last Activity On
Jul 5, 2017 12:22 PM UTC

Hi,

I just set up the Syncfusion Dashboard platform and imported some test data into Hadoop (Avro files) via the Integration platform. Then I loaded the data from Hadoop into Hive tables (converting Avro to a table); in total I have just 400 rows. Then I used this Hive instance as the data source for a Grid dashboard.

When I try to fetch data, it works very slowly. My computer has 64 GB of RAM and SSDs, but fetching 400 rows of data takes 45 seconds. Can somebody point me to how to figure this out? All settings in Syncfusion are at their defaults.

Thanks!

3 Replies

NK Nandhini K Syncfusion Team July 3, 2017 12:31 PM UTC

Hi Bochkov,

Thanks for using Syncfusion products.

Please find the response to your query below.

Query:

When I try to fetch data it works very slowly. My computer has 64 GB RAM and SSDs, but fetching 400 rows of data takes 45 seconds. Can somebody point me to how to figure this out?

Response:

Reason for Slow Performance:

Because data processing in Hive Server2 runs as MapReduce jobs with multiple disk read/write operations, a query takes considerable time on small and large data sets alike.

The following is a sample of the query the dashboard generates to fetch data through Hive Server2 and bind it to the selected widgets. It results from adding the “contactname” column of the table “customers2” to the selected widget.

SELECT Sub_Table.Grid_Column_0 AS Grid_Column_0
FROM (
    SELECT customers2.contactname AS Grid_Column_0,
           ROW_NUMBER() OVER (ORDER BY customers2.contactname ASC) AS RowIndexColumn
    FROM default.customers2 AS customers2
    GROUP BY customers2.contactname
) Sub_Table
WHERE RowIndexColumn BETWEEN 1 AND 200;
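To make the shape of that query easier to see, here is a minimal Python sketch that rebuilds the same statement for a given table, column, and row window. The function name `build_paged_query` and its parameters are hypothetical and only illustrate the pattern (GROUP BY to de-duplicate, ROW_NUMBER to page); this is not the dashboard's actual query generator, and it does no identifier escaping, so it should only be fed trusted names.

```python
def build_paged_query(table, column, first_row=1, last_row=200, schema="default"):
    """Rebuild the paged SELECT shown above for a single column.

    Hypothetical helper for illustration only; identifiers are not escaped.
    """
    qualified = f"{table}.{column}"
    # Inner query: de-duplicate the column and number the rows in sorted order.
    inner = (
        f"SELECT {qualified} AS Grid_Column_0, "
        f"ROW_NUMBER() OVER (ORDER BY {qualified} ASC) AS RowIndexColumn "
        f"FROM {schema}.{table} AS {table} "
        f"GROUP BY {qualified}"
    )
    # Outer query: keep only the requested window of row numbers.
    return (
        "SELECT Sub_Table.Grid_Column_0 AS Grid_Column_0 "
        f"FROM ({inner}) Sub_Table "
        f"WHERE RowIndexColumn BETWEEN {first_row} AND {last_row}"
    )


query = build_paged_query("customers2", "contactname")
print(query)
```

Even for a single column, the statement needs a GROUP BY, a sort for the window function, and an outer filter. Each of those steps typically involves a shuffle when Hive runs on MapReduce, which is consistent with the slow timings reported for a 400-row table.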

Metrics:

The table below compares the run times of a count query and of the query shown above on Hive Server2 and on Spark SQL.

Query	Hive Server2	Spark SQL
Select query with GROUP BY and ORDER BY clauses on the table created from the Avro file	66 seconds	5 seconds
Count query	30 seconds	0.2 seconds
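For a rough sense of scale, the ratios implied by these timings can be worked out as below. The numbers are copied from the table; the dictionary keys and the `speedup` helper are just illustrative names.

```python
# Timings in seconds, copied from the Metrics table above.
timings = {
    "select_group_order": {"hive_server2": 66.0, "spark_sql": 5.0},
    "count": {"hive_server2": 30.0, "spark_sql": 0.2},
}


def speedup(hive_seconds, spark_seconds):
    """Return how many times faster the Spark SQL run was."""
    return hive_seconds / spark_seconds


speedups = {
    name: speedup(t["hive_server2"], t["spark_sql"])
    for name, t in timings.items()
}

for name, factor in speedups.items():
    print(f"{name}: Spark SQL is about {factor:.1f}x faster")
```

These are single measurements from one environment, so the ratios are indicative only.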

Recommended Solution:

Hive Server2 (MapReduce) is well suited to batch processing of large data sets. For near-real-time analytics such as dashboard visualization, we recommend the Spark SQL data source instead, because Spark SQL processes data in memory and avoids multiple disk I/O operations.

Tables created in Hive can also be accessed from “Spark SQL” in the Syncfusion distribution, as both use the same metastore database. So you can use the “Spark SQL” connection type in the Syncfusion Dashboard platform instead of “Hive”.

Regards,

Nandhini K.

IB Ilya Bo July 4, 2017 09:18 AM UTC

Nandhini, thanks for your answer!

One more thing I would like to ask:

When I try to create a Spark SQL data source, I don't see my tables.

I created the tables in several ways:

I created a test table from the attached sample (Scala).
I also created one using AvroSerDe (HQL):

But when I use the Hive data source, I do see them. What is the problem?

Thanks!

Attachment: AvroFileSchema1089509997_df60505f.zip

DG Dhivyabharathi Govindaraj Syncfusion Team July 5, 2017 12:22 PM UTC

Hi Ilya,

We have created a new support incident under your Direct Trac account, since the reported query is considered an issue. Please follow the link below to access your account.

https://www.syncfusion.com/support/directtrac/incidents

Regards,

Dhivya
