
Syncfusion Big Data: query data from HDFS in Avro format

Thread ID: 131126
Created: Jun 22, 2017 06:41 AM
Updated: Jun 27, 2017 09:02 AM
Platform: Big Data Platform
Replies: 3
Tags: General
Ilya Bo
Asked On June 22, 2017 06:41 AM

I have data stored in HDFS in Avro format.
How can I query it with Spark SQL from Syncfusion Big Data Studio?

Thanks!

Aravindraja Thinakaran [Syncfusion]
Replied On June 23, 2017 03:20 AM

Hi Ilya, 

Thanks for contacting Syncfusion support. 

Please follow the steps below to access Avro files stored in HDFS using Spark SQL from Big Data Studio.

Step 1: Download and extract the spark-avro_2.11-3.2.0.jar file, then copy the jar to the following location.
<Install Drive>:\Syncfusion\BigData\<Install Drive>\BigDataSDK\SDK\Spark\jars\

Step 2: Restart the Spark Thrift server service from Service Manager.

Step 3: Using "/Data/Spark/Resources/Users.avro" in HDFS as the input file, create a table in Spark SQL with the following command.
CREATE TABLE Users USING com.databricks.spark.avro OPTIONS (path "/Data/Spark/Resources/Users.avro");

Step 4: After the table is created, use the following command to view its contents.
SELECT * FROM Users;
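
For a quick check that the jar from Step 1 is being picked up, the same file can also be loaded through the DataFrame API. This is only a minimal sketch, assuming Spark 2.x with the spark-avro jar on the classpath and an environment (for example the Spark Scala tab) where a SparkSession can be obtained. The application name is a placeholder.

```scala
import org.apache.spark.sql.SparkSession

// Reuses the running session if one already exists (as in a Spark shell),
// otherwise creates a new one. The app name is an arbitrary placeholder.
val spark = SparkSession.builder().appName("AvroQuickCheck").getOrCreate()

// Load the Avro file from HDFS using the Databricks spark-avro data source,
// the same source named in the CREATE TABLE statement in Step 3.
val users = spark.read
  .format("com.databricks.spark.avro")
  .load("/Data/Spark/Resources/Users.avro")

// Print the schema Spark derived from the schema embedded in the Avro file,
// then show the first few rows.
users.printSchema()
users.show(10)
```

If the data source class cannot be found, the jar is most likely not in the Spark jars folder or the service has not been restarted yet.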

Thanks, 
Aravindraja T 


Ilya Bo
Replied On June 26, 2017 11:31 AM

Thank you for your answer! One more thing I would like to clarify: how do I specify an Avro schema (.avsc file) correctly?

Aravindraja Thinakaran [Syncfusion]
Replied On June 27, 2017 09:02 AM

Hi Ilya, 

You can specify a custom Avro schema (.avsc file) using the Scala API and then access the data using Spark SQL as usual. Please follow the procedure below.

Step 1: In the Spark Scala tab of Big Data Studio, run a script that creates the Spark table with the Avro schema specified.

Step 2: Access the table as usual using Spark SQL.

Note:
There appears to be a limitation in specifying a custom Avro schema through the Spark SQL API, so this solution uses the Scala API to specify the custom schema.
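
The script for Step 1 is not preserved in this thread. The following is a minimal sketch of what such a script could look like, not the original one. It assumes Spark 2.x with the spark-avro jar installed as described earlier, and it relies on the `avroSchema` read option documented for the Databricks spark-avro package. The .avsc path and the application name are hypothetical, so adjust them to your environment.

```scala
import java.io.File
import org.apache.avro.Schema
import org.apache.spark.sql.SparkSession

// Reuses the running session if one already exists, otherwise creates one.
val spark = SparkSession.builder().appName("AvroCustomSchema").getOrCreate()

// Hypothetical local path to the schema file; replace with your own .avsc.
val schema = new Schema.Parser().parse(new File("C:/schemas/Users.avsc"))

// Pass the schema as a JSON string through the "avroSchema" option so the
// reader uses it instead of the schema embedded in the data file.
val users = spark.read
  .format("com.databricks.spark.avro")
  .option("avroSchema", schema.toString)
  .load("/Data/Spark/Resources/Users.avro")

// Register the result under a name that Spark SQL can query.
users.createOrReplaceTempView("Users")
spark.sql("SELECT * FROM Users").show()
```

A temporary view is visible only within the session that created it. If the Spark SQL tab runs in a different session, writing a persistent table with `users.write.saveAsTable(...)` may be needed instead; whether that applies to Big Data Studio is an assumption to verify.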
 
Thanks, 
Aravindraja T. 

