Syncfusion Big Data: query data from HDFS in Avro format

Thread ID: 131126
Created: Jun 22, 2017 10:41 AM UTC
Updated: Jun 27, 2017 01:02 PM UTC
Platform: Big Data Platform
Replies: 3
Tags: General
Ilya Bo
Asked On June 22, 2017 10:41 AM UTC

I have data stored in HDFS in Avro format.
How can I query it with Spark SQL from Syncfusion Big Data Studio?


Aravindraja Thinakaran [Syncfusion]
Replied On June 23, 2017 07:20 AM UTC

Hi Ilya, 

Thanks for contacting Syncfusion support. 

Please follow the steps below to access Avro files stored in HDFS using Spark SQL from Big Data Studio.

Step 1: Download and extract the spark-avro_2.11-3.2.0.jar file, then copy the jar to the following location.
<Install Drive>:\Syncfusion\BigData\<Install Drive>\BigDataSDK\SDK\Spark\jars\
Step 2: Restart the Spark Thrift server service from Service Manager.

Step 3: Use "/Data/Spark/Resources/Users.avro" as the input file from HDFS to create a table in Spark SQL with the command below.
CREATE TABLE Users USING com.databricks.spark.avro OPTIONS (path "/Data/Spark/Resources/Users.avro");

Step 4: After the table is created, use the command below to view its contents.
SELECT * FROM Users;

Aravindraja T 

Ilya Bo
Replied On June 26, 2017 03:31 PM UTC

Thank you for your answer! One more thing I would like to clarify: how do I specify an Avro schema (.avsc file) correctly?

Aravindraja Thinakaran [Syncfusion]
Replied On June 27, 2017 01:02 PM UTC

Hi Ilya, 

You can specify a custom Avro schema (.avsc file) using the Scala API and then query the data with Spark SQL as usual. Please follow the procedure below.
Step 1: In the Spark Scala tab of Big Data Studio, create a Spark table with the Avro schema specified by running a Scala script.
Step 2: Access the table as usual using Spark SQL.
There appears to be a limitation on specifying a custom Avro schema through the Spark SQL API, so we have provided a solution that uses the Scala API to specify the schema.
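
The script for Step 1 does not appear in the thread. As a rough sketch only, and not the script Syncfusion supplied, something along these lines may work, assuming the spark-avro jar from the earlier reply is on the classpath and that this build accepts the `avroSchema` read option. The `.avsc` path and the view name `UsersWithSchema` are hypothetical placeholders; the Avro data path is the one used earlier in the thread.

```scala
import java.io.File
import org.apache.avro.Schema
import org.apache.spark.sql.SparkSession

// Reuse the session the Scala tab already provides, or create one if none exists.
val spark = SparkSession.builder().appName("AvroWithSchema").getOrCreate()

// Hypothetical local path to the schema file; replace with the real .avsc location.
val schema = new Schema.Parser().parse(new File("/path/to/Users.avsc"))

// Read the Avro data from HDFS, passing the schema as a JSON string
// (assumes the spark-avro version in use supports the "avroSchema" option).
val users = spark.read
  .format("com.databricks.spark.avro")
  .option("avroSchema", schema.toString)
  .load("/Data/Spark/Resources/Users.avro")

// Register the DataFrame under a hypothetical name so it can be queried with SQL.
users.createOrReplaceTempView("UsersWithSchema")
spark.sql("SELECT * FROM UsersWithSchema").show()
```

A temporary view is visible only within the Spark session that created it. If the Spark SQL tab runs in a separate session, saving the data with `users.write.saveAsTable(...)` may be needed instead, which is likewise an assumption about the environment.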
Aravindraja T. 

