left-icon

SQL Server Analysis Services Succinctly®
by Stacia Misner

Previous
Chapter

of
A
A
A

CHAPTER 1

Introduction to SQL Server Analysis Services

Introduction to SQL Server Analysis Services


SQL Server Analysis Services is one of several components available as part of Microsoft SQL Server 2012 that you can use to develop a business intelligence analytic solution. In this introduction to SQL Server Analysis Services, I explain the concept of business intelligence and all available options for architecting a business intelligence solution. I also review the process of developing an Analysis Services database at a high level and introduce the tools you use to build, manage, and query Analysis Services.

What Is Business Intelligence?

Business intelligence means different things to different people. Regardless of how broadly or narrowly the term is used, a globally accepted concept is that it supports the decision-making process in organizations. In short, people at all levels of an organization must gather information about the events occurring in their business before making a decision that can help that business make or save money.

A common problem in many businesses is the inability of the operational systems gathering details about business events to facilitate the information-gathering process and consequently the decision-making process is impeded. When the only source of information is an operational system, at worst people rely on gut instinct and make ill-informed decisions because they cannot get the information they need, while at best people have tools or other people to help them compile the needed data, but that process takes time and is tedious.

Most business applications store data in relational systems, which anyone can query if they have the right tools, skills, and security clearance. Why then is it necessary to move the data into a completely different type of database? To understand this requirement and why Analysis Services is included in a business intelligence solution, it’s helpful to compare the behavior of a relational engine like SQL Server with an Online Analytical Processing (OLAP) engine like Analysis Services. First, let’s consider the three types of questions that are important to decision makers as they analyze data to understand what’s happening in the business:

  • Summarization. Users commonly want to summarize information for a particular range of time, such as total sales across a specified number of years.
  • Comparison. Users want to answer questions that require comparative data for multiple groups of information or time periods. For example, they might want to see total sales by product category. They might want to break down this data further to understand total sales by product category or by all months in the current year.
  • Consolidation. Users often also have questions that require combining data from multiple sources. For example, they might want to compare total sales with the forecasted sales. Typically, these types of data are managed in separate applications.

Note: For the purposes of this book, I use summary, comparison, and consolidation questions to represent the business requirements for a business intelligence solution to build. Although the scenario I discuss is extremely simple, the same principles that I describe here also apply to real-world scenarios for which decision-makers have many more questions that require answers that the data can answer, if only it were structured in a better way..

Each of these types of queries can be problematic when the data is available in a relational engine only for the following reasons:

  • Queries for decision-making rely on data stored in the same database that is being used to keep the business running. If many users are executing queries that require the summarization of millions of rows of data, a resource contention problem can arise. A summarized query requires lots of database resources and interferes with the normal insert and update operations that are occurring at the same time as the business operations.
  • Data sources are often focused on the present state. Historical data is archived after a specified period of time. Even if it is not archived completely, it might be kept at a summarized level only.
  • Calculations often cannot be stored in the relational database because the base values must be aggregated before calculations are performed. For example, a percent margin calculation requires the sum of sales and the sum of costs to be calculated first, total costs to be subtracted from total sales next, and finally the result to be derived by dividing it by the total sales. Whether the logic is relatively simple, as with a percent margin calculation, or complex as with a weighted allocation for forecasting, that logic is not stored in the relational engine and must be applied at query time. In that case, there is no guarantee that separate users using different tools to gather data will construct the calculation in identical ways.
  • Relational storage of data often uses a structure called third normal form, which spreads related data across multiple tables. As a result, the retrieval of data from these tables requires complex queries that can be difficult to write and can contain many joins that might cause queries to run slowly.

An OLAP engine solves these problems in the following ways:

  • The use of a separate data source for querying reduces resource contention. Of course, you can maintain a replica of a relational database that you dedicate to reporting, but there are other reasons to prefer OLAP over relational.
  • You can retain historical data in an OLAP database that might otherwise be eliminated by overwrites or archiving the source system. Again, you can resolve this problem by creating a relational data mart or data warehouse, but there are still other reasons to implement OLAP.
  • A more significant benefit of OLAP is the centralization of business logic to ensure all users get the same answer to a particular query regardless of when the query is run or the tool used to execute the query.
  • The storage mechanism used by Analysis Services is designed for fast retrieval of data. If you prefer to write a query rather than use a query builder tool, many times the queries are shorter and simpler (once you learn the query syntax for Analysis Services, MDX).
  • OLAP databases store data in binary format, resulting in smaller files and faster access to the data.
  • Last but not least, OLAP databases provide users with self-service access to data. For example, a Microsoft Excel user can easily connect to an Analysis Services cube and browse its data by using pivot charts or pivot tables.

Architecture Options

Multidimensional or Tabular?

There are several different ways that you can architect Analysis Services:

  • Prototype. This is the simplest architecture to implement. In this case, you install Analysis Services on a server, and then create and process a database to load it with data. Your focus is on a single data load to use in a proof of concept and therefore you do not implement any data refresh processes as part of the architecture.
  • Personal or team use. If you have a single data source with a relatively simple structure and small volumes of data, and if you have no need to manage historical changes of the data (also known as slowly changing dimensions), you can implement Analysis Services and add a mechanism for refreshing your Analysis Services database on a periodic basis, such as nightly or weekly.
  • Department or enterprise use. As the number of users requiring access to the database grows, or the number of data sources or complexity of the data structure increases, you need to set up a more formal architecture. Typically, this requires you to set up a dedicated relational source for Analysis Services, such as a subject-specific data mart or a data warehouse that houses multiple data marts or consolidates data from multiple sources. In this scenario, you implement more complex extract, transform, and load (ETL) processes to keep the data mart or data warehouse up-to-date and also to keep the Analysis Services database up-to-date. If you need to scale out the solution, you can partition the Analysis Services database.

Note: Although the focus of this book is the multidimensional server mode for Analysis Services, the architecture for an environment that includes Analysis Services in tabular server mode is similar. Whereas a multidimensional database requires relational data sources, a tabular database can also use spreadsheets, text data, and other sources. You can use the same client tools to query the databases.

In the prototype architecture, your complete environment can exist on a single server, although you are not required to set it up this way. It includes a relational data source, an Analysis Services instance, and a client tool for browsing the Analysis Services database, as shown in Figure 1. The relational data source can be a SQL Server, DB2, Oracle, or any database that you can access with an OLE DB driver. For prototyping purposes, you can use the Developer Edition of Analysis Services, but if you think the prototype will evolve into a permanent solution, you can use the Standard, Business Intelligence, or Enterprise Edition, depending on the features you want to implement as described in Table 1. For browsing the prototype database, Excel 2007 or higher is usually sufficient.

Prototype Architecture

  1. Prototype Architecture
  1. Feature Comparison by Edition

Feature

Standard Edition

Business Intelligence Edition

Developer and Enterprise Editions

Account Intelligence

Yes

Yes

Yes

Actions

Yes

Yes

Yes

Advanced Dimensions (Reference, Many-to-Many)

Yes

Yes

Yes

Advanced Hierarchy Types (Parent-Child, Ragged)

Yes

Yes

Yes

Aggregations

Yes

Yes

Yes

Binary and Compressed XML Transport

Yes

Yes

Yes

Custom Assemblies

Yes

Yes

Yes

Custom Rollups

Yes

Yes

Yes

Dimension and Cell Level Security

Yes

Yes

Yes

Direct Writeback

No

Yes

Yes

Drillthrough

Yes

Yes

Yes

Hierarchies

Yes

Yes

Yes

High Availability

Yes

Yes

Yes

KPIs

Yes

Yes

Yes

Linked Measures and Dimensions

No

Yes

Yes

MDX Queries and Scripts

Yes

Yes

Yes

Measure Expressions

No

Yes

Yes

MOLAP, ROLAP, and HOLAP Storage Modes

Yes

Yes

Yes

Multiple Partitions

Up to 3

Yes

Yes

Perspectives

No

Yes

Yes

Proactive Caching

No

Yes

Yes

Programmability (AMO, AMOMD.NET, OLEDB, XML/A, ASSL)

Yes

Yes

Yes

Push-Mode Processing

No

Yes

Yes

Role-Based Security Model

Yes

Yes

Yes

Scalable Shared Databases (Attach/Detach, Read Only)

No

Yes

Yes

Scalable String Storage

Yes

Yes

Yes

Semi-additive Measures

LastChild only

Yes

Yes

Time Intelligence

Yes

Yes

Yes

Translations

Yes

Yes

Yes

Writeback Cells

Yes

Yes

Yes

Writeback Cube

Yes

Yes

Yes

Writeback Dimensions

No

Yes

Yes

For a personal or team solution, you introduce automation to keep data current in Analysis Services. You use the same components described in the prototype architecture: a data source, Analysis Services, and a browsing tool. However, as shown in Figure 2, you add Integration Services as an additional component to the environment. Integration Services uses units called packages to describe tasks to execute. You can then use a scheduled process to execute one or more packages that update the data in the Analysis Services database. Excel is still a popular choice as a browsing tool, but you might also set up Reporting Services to provide access to standard reports that use Analysis Services as a data source.

Personal or Team Architecture

  1. Personal or Team Architecture

To set up an architecture for organizational use, as shown in Figure 3, you introduce a data mart or data warehouse to use as a source for the data that is loaded into Analysis Services. Integration Services updates the data in the data mart on a periodic basis and then loads data into Analysis Services from the data mart. In addition to Excel or Reporting Services as client tools, you can also use SharePoint business intelligence features, which include Excel Services, SharePoint status indicators and dashboards, or PerformancePoint scorecards and dashboards.

Organizational Architecture

  1. Organizational Architecture

Development, Management, and Client Tools

If you are responsible for creating or maintaining an Analysis Services database, you use the following tools:

  • SQL Server Data Tools (SSDT)
  • SQL Server Management Studio (SSMS)
  • A variety of client tools

SSDT is the environment you use to develop an Analysis Services database. Using this tool, you work with a solution that contains one or more projects, just like you would when developing applications in Visual Studio.

You can use SSMS to configure server properties that determine how the server uses system resources. You can also use Object Explorer to see the databases deployed to the server and explore the objects contained within each database. Not only can you view an object’s properties, but in some cases you can also make changes to those properties. Furthermore, you can create scripts of an object’s definition to reproduce it in another database or on another server.

SSMS also gives you a way to quickly check data either in the cube itself or in individual dimensions. You can use the MDX query window to write and execute queries that retrieve data from the cube. A graphical interface is also available for browsing these objects without the need to write a query.

Another feature in SSMS is the XML for Analysis (XMLA) query window in which you write and execute scripts. You can use XMLA scripts to create, alter, or drop database objects and also to process objects, which is the way that data is loaded into an Analysis Services database. You can then put these scripts into Integration Services packages to automate their execution or you can put them into SQL Server Agent jobs. However, you are not required to use scripts for processing. You can instead manually process objects in SSMS whenever necessary or create Integration Services packages to automate processing.

As part of the development process, you should use the client tools that your user community is likely to use to ensure the browsing experience works as intended. In this chapter, I explain the choice of client tools available from Microsoft, but there are also several third-party options to consider, and of course you can always create a custom application if you want users to have specific functionality available. The Microsoft business intelligence stack includes the following tools:

  • Excel. This is a very common choice for browsing a cube since users are often already using Excel for other reasons and likely have some experience with pivot tables. Excel provides an easy-to-use interface to select dimensions for browsing, as shown in Figure 4, and also offers advanced functionality for filtering, sorting, and performing what-if analysis.

Cube Browsing with Excel

  1. Cube Browsing with Excel
  • Reporting Services. This is an option when users need to review information but are doing less exploration of the data. These users access the data by using pre-built static reports, as shown in Figure 5.

Report with Analysis Services Data Source

  1. Report with Analysis Services Data Source
  • SharePoint. You can use Analysis Services as a data source for dashboard filters, as shown in Figure 6.

Dashboard Filter with Analysis Services Data Source

  1. Dashboard Filter with Analysis Services Data Source
  • PerformancePoint Services. You can create scorecards, as shown in Figure 7, and dashboards using Analysis Services as a data source.

Analysis Services Key Performance Indicators in a Scorecard

  1. Analysis Services Key Performance Indicators in a Scorecard

Database Development Process

Before diving into the details of Analysis Services database development, let’s take a look at the general process:

  1. Design a dimension model.
  2. Develop dimension objects.
  3. Develop cubes for the database.
  4. Add calculations to the cube.
  5. Deploy the database to the server.

First, you start by designing a dimensional model. You use either an existing dimensional model that you already have in a data mart or data warehouse, or you define the tables or views that you want to use as sources, set up logical primary keys, and define relationships to produce a structure that’s very similar to a dimensional model that you would instantiate in the relational database. I describe this step in more detail in Chapter 2, “Designing the dimensional model.”

Once the dimensional model is in place, you then work through the development of the dimension objects. When browsing a cube, you use dimensions to “slice and dice” the data. You will learn more about this step in the process in Chapter 3, “Developing dimensions.”

The next step is to develop one or more cubes for the database. This is often an iterative process where you might go back and add more dimensions to the database and then return to do more development work on a cube. I will explain more about this in Chapter 4, “Developing cubes.”

Eventually you add calculations to the cube to store the business logic in the cube for data that’s not available in the raw data. There are specialized types of calculations to produce sets of dimension members and key performance indicators. You will learn how to work with all these types of calculations in Chapter 5, “Enhancing cubes with MDX.”

During and after the development work, you deploy the database to the server and process objects to load them with data. It’s not necessary to wait until you’ve completed each step in the development process to deploy. It’s very common to develop a dimension, deploy it so that you can see the results, go back and modify the dimension, and then deploy again. You continue this cycle until you are satisfied with the dimension, and then you are ready to move on to the development of the next dimension.

Anatomy of an Analysis Services Project

To start the multidimensional database development process in SSDT, you create a new project in SSDT. Here you can choose from one of the following project types:

  • Analysis Services Multidimensional and Data Mining Project. You use this project type to build a project from scratch. The project will initially be empty, and then you build out each object individually, usually using wizards to get started quickly.
  • Import from Server (Multidimensional and Data Mining). If an Analysis Services database is already deployed to the server, you can import the database objects and have SSDT reverse engineer the design and create all the objects in the project.

Whether you start with an empty Analysis Services project or import objects from an existing Analysis Services database, there are several different types of project items that you have in an Analysis Services project:

  • Data Source. This item type defines how to connect to an OLE DB source that you want to use. If you need to change a server or database name, then you have only one place to make the change in SSDT.
  • Data Source View (DSV). The data source view represents the dimensional model. Everything you build into the Analysis Services database relies on the definitions of the data structures that you create in the data source view.
  • Cube. An Analysis Services project has at least one cube file in it. You can create as many as you need.
  • Dimension. Your project must have at least one dimension, although most cubes have multiple dimensions.
  • Role. You use roles to configure user access permissions. I explain how to do this in Chapter 6, “Managing Analysis Services databases.” It’s not necessary to create a role in SSDT, however. You can add a role later in SSMS instead.
Scroll To Top
Disclaimer
DISCLAIMER: Web reader is currently in beta. Please report any issues through our support system. PDF and Kindle format files are also available for download.

Previous

Next



You are one step away from downloading ebooks from the Succinctly® series premier collection!
A confirmation has been sent to your email address. Please check and confirm your email subscription to complete the download.