SQL Server Metadata Succinctly^®
by Joseph D. Booth

CHAPTER 5

Tables and Columns

Although we covered the information schema views back in Chapter 2, SQL Server provides additional views and procedures we can use to view our tables and columns in more depth. In this chapter, we will look at the sys.tables and various sys column views to provide some table and column analysis.

Note: Many of these views are derived from sys.objects. Every “object” in a SQL database (tables, procedures, triggers, etc.) has an object ID associated with it, and sys objects hold the information about the object. The OBJECT_NAME() SQL function returns an object name from the object_id parameter.

Tables

We can join the sys.tables view with other views to determine information about our table design. This allows us to perform additional table analysis beyond the basic information schema views. When referencing the sys.tables view, we can use the object_schema_name and object_name SQL functions to create a table name (schema + table name).

sys.identity_columns (starting with SQL 2008)

This table holds information about tables that have identity columns present.

Code Listing 27: Identity columns

---------------------------------------------

-- Script: Identity_columns.sql

-- List all identity columns

---------------------------------------------

SELECT

OBJECT_SCHEMA_NAME(st.object_id)+'.'+st.name AS [TableName],

ic.name AS KeyName,

t.name AS dataType,

ic.seed_value,ic.increment_value,

isNull(ic.last_value,0) AS Last_Value

FROM sys.tables st

JOIN sys.identity_columns ic ON ic.object_id=st.object_id

JOIN sys.types t ON t.system_type_id=ic.system_type_id

ORDER BY [TableName]

This listing shows tables that have an Identity column, along with the seed value and increment for the column. If the table has rows, the Last_value column will report the last seed number.

Tip: You can use the DBCC CHECKIDENT ('[TableName]', RESEED, 0) command to reset an identity column’s value.

sys.computed_columns (starting with SQL 2008)

This view allows you to get the definitions for any computed columns in the database. The definition will be a SQL expression that is used to provide the column value.

Code Listing 28: Computed columns

---------------------------------------------

-- Script: Computed_Columns.sql

-- List all computed columns

---------------------------------------------

SELECT

OBJECT_SCHEMA_NAME(st.object_id)+'.'+st.name AS [TableName],

cc.name AS KeyName,

t.name AS dataType,

cc.definition AS [Column_Definition],cc.is_persisted

FROM sys.tables st

JOIN sys.computed_columns cc on cc.object_id=st.object_id

JOIN sys.types t on t.system_type_id=cc.system_type_id

ORDER BY [TableName]

Note: If a computed column is persisted, the value will be stored on disk, and can be used for indexing, checking constraints, etc. It will be updated when the data is updated. If it is not persisted, the value will be virtual and will be computed every time the column is referenced.

sys.default_constraints (starting with SQL 2008)

A database table may have a default value to provide during an INSERT when the corresponding field is NULL. The following script lists all the default constraints and the table and columns they are found in.

Code Listing 29: Default constraints

---------------------------------------------

-- Script: Default_Constraints.sql

-- List all default constraints

---------------------------------------------

SELECT object_schema_name(dc.parent_object_id)+'.'+

object_name(dc.parent_object_id) as TableName,

c.name as ColumnName,

dc.definition

FROM sys.default_constraints AS dc

INNER JOIN sys.columns AS c

ON dc.parent_object_id = c.object_id

AND dc.parent_column_id = c.column_id

ORDER BY tableName,columnName

Note: If a column has a default of NewID() to generate a new UniqueIdentifier and this column is part of an index, you should consider using NewSequentialId() instead. Because NewID() generates a random unique identifier, it is likely to cause an index split, as it attempts to insert the record into the index. (The NewSequentialID() function generates a unique identifier higher than the prior one.) So while you can improve creation performance with NewSequentialID, it is also a privacy risk to use sequential (hence predictable) identifiers.

sys.index_columns (starting with SQL 2008)

This script will identify all the columns that are used as indexes in the various tables. This can be a handy way to determine whether a new search you want to add is already indexed in the table (improving performance).

Code Listing 30: Index columns

---------------------------------------------

-- Script: Index_columns.sql

-- List all indexed columns

---------------------------------------------

SELECT st.name as TableName

,i.name as IndexName

,COL_NAME(ic.object_id,ic.column_id) AS ColumnName

FROM sys.indexes AS i

INNER JOIN sys.index_columns AS ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id

JOIN sys.tables st on st.object_id=ic.object_id

ORDER BY [TableName],ic.index_column_id,ColumnName

You can add the WHERE expression ic.index_column = 1 to find only those columns that are the first expression in the index. You could also specify all columns you need to retrieve to see whether there is a covering index for your search parameters.

sys.key_constraints (starting with SQL 2008)

This script is used to identify the columns that are being used as primary keys in a table. It shows the table and index name, as well as the column(s) making up the key.

Code Listing 31: Key constraints

---------------------------------------------

-- Script: Key_constraints.sql

-- List all key constraint columns

---------------------------------------------

select object_schema_name(tb.object_id)+'.'+tb.name as [TableName],

object_name(kc.object_id) as IndexName,

sc.name as ColumnName

from sys.tables tb

join sys.key_constraints kc on kc.parent_object_id=tb.object_id

join sys.index_columns ic

on ic.object_id=kc.parent_object_id and kc.unique_index_id=ic.index_id

join sys.columns sc on sc.object_id=ic.object_id and ic.index_column_id=sc.column_id

where tb.type='U' and kc.type='PK'

order by tableName,ic.index_column_id

sys.check_constraints (starting with SQL 2008)

A check constraint is a SQL expression that is applied to a column to validate the type of data allowed in that column. For example, you might use the following LIKE expression to ensure a zip code field only contains five digits.

zip LIKE '[0-9][0-9][0-9][0-9][0-9]'

Code Listing 32 will list all columns in the database that have check constraints applied to them.

Code Listing 32: Check constraints

---------------------------------------------

-- Script: Check_constraints.sql

-- List all check constraints

---------------------------------------------

SELECT object_schema_name(dc.parent_object_id)+'.'+

object_name(dc.parent_object_id) as TableName,

c.name as ColumnName,

dc.definition

FROM sys.check_constraints AS dc

INNER JOIN sys.columns AS c

ON dc.parent_object_id = c.object_id

AND dc.parent_column_id = c.column_id

ORDER BY tableName,columnName

sys.masked_columns (starting with SQL 2016)

In SQL 2016, a new feature called Dynamic Data Masking was added, which provides the ability to apply masked to a column, so users querying the data will not see the actual column contents. To create a masked column, you add the MASKED WITH (expression) to the table column. For example, to make an email column, you could use the following.

Email VARCHAR(150) MASKED WITH (Function = ‘email()’)

When the Email column appears in a query, it will be displayed as [email protected]. It is a handy feature for simple security in a database. You can identify all the masked columns in a database using the following query.

Code Listing 33: Masked columns

---------------------------------------------

-- Script: Masked_Columns.sql

-- List all masked columns

---------------------------------------------

if SERVERPROPERTY('ProductMajorVersion')>='13'

SELECT object_schema_name(tb.object_id)+'.'+tb.name as [TableName],

c.name AS column_name, c.masking_function

FROM sys.masked_columns AS c

JOIN sys.tables AS tb ON c.object_id = tb.object_id

WHERE is_masked = 1

ELSE

SELECT 'Requires SQL 2016 or higher'

Putting it all together

You can combine the various queries to produce a report of tables and columns in your system, along with column information. Code Listing 34 shows a table/column reporting script.

Code Listing 34: Table/column detail script

---------------------------------------------

-- Script: Column_Report.sql

-- List column details

---------------------------------------------

SELECT object_schema_name(tb.object_id)+'.'+

object_name(tb.object_id) as TableName,

c.name as ColumnName,

isNull(pk.PK,'') as IsKey,

isNull(ic.IdentityColumn,'') as Identity_Column,

isNull(cc.CheckConstraint,'') as Check_Constraint,

isNull(dc.DefaultConstraint,'') as Default_Constraint

FROM sys.tables as tb

JOIN sys.columns AS c on c.object_id=tb.object_id

LEFT JOIN (

select ic.object_id,ic.index_column_id,'PRIMARY' as PK

from sys.tables tb

join sys.key_constraints kc on kc.parent_object_id=tb.object_id

join sys.index_columns ic on ic.object_id=kc.parent_object_id

and kc.unique_index_id=ic.index_id

where tb.type='U' and kc.type='PK'

) pk on pk.object_id=tb.object_id and pk.index_column_id=c.column_id

LEFT JOIN (

SELECT ic.object_id,ic.name,ic.name+' identity('+

cast(ic.seed_value as varchar(10))+','+

cast(ic.increment_value as varchar(10))+') ' as IdentityColumn

from sys.tables st

join sys.identity_columns ic on ic.object_id=st.object_id

) ic ON tb.object_id=ic.object_id and c.name=ic.name

LEFT JOIN (

SELECT dc.parent_object_id,dc.parent_column_id,

dc.definition as CheckConstraint

FROM sys.check_constraints AS dc

INNER JOIN sys.columns AS c ON dc.parent_object_id = c.object_id AND dc.parent_column_id = c.column_id

) cc on cc.parent_object_id=tb.object_id and cc.parent_column_id=c.column_id

LEFT JOIN (

SELECT dc.parent_object_id,dc.parent_column_id,

dc.definition as DefaultConstraint

FROM sys.default_constraints AS dc

INNER JOIN sys.columns AS c ON dc.parent_object_id = c.object_id AND dc.parent_column_id = c.column_id

) dc on dc.parent_object_id=tb.object_id and dc.parent_column_id=c.column_id

ORDER BY tableName,column_id

When this script is run, it will produce a report of all table names and columns, and indicate which columns are primary keys, identity columns, constraints, and so on.

Searching for deprecated columns

In early versions of SQL, there were text and image columns called text, nText, and image. These columns (while still supported) were deprecated in SQL Server 2005. The following script allows you to search for deprecated column types and indicates the appropriate replacement column type.

Code Listing 35: Deprecated columns

---------------------------------------------

-- Script: Deprecated_columns.sql

-- List all columns with deprecated types

---------------------------------------------

select t.name,c.name as ColName, 'Deprecated: '+

CASE

WHEN tp.name = 'text' then 'Replace [text] with varchar(max)'

WHEN tp.name = 'ntext' then 'Replace [ntext] with nvarchar(max)'

WHEN tp.name = 'image' then 'Replace [image] with varbinary(max)'

ELSE 'Table contains Text,nText, or Image fields' END as Msg

from sys.columns c

join sys.tables t on c.object_id=t.object_id

join sys.types tp on tp.user_type_id=c.user_type_id

where t.is_ms_shipped=0 and tp.name in ('text','ntext','image')

If you are using any of these column data types, you should plan on changing the data type to keep current with SQL Server.

Numeric columns

SQL Server and most database servers perform much better with integer values, rather than a numeric data type. If a decimal or numeric column has a scale of 0 (no decimal place), you should consider replacing that column with the equivalent integer column. Code Listing 36 will search for any numeric columns with a zero scale and suggest the equivalent integer column type.

Code Listing 36: Suggest integer columns

---------------------------------------------

-- Script: Suggest_integers.sql

-- Convert numeric columns to integers

---------------------------------------------

SELECT tb.table_schema, tb.table_name, tc.column_name as colname,

CASE

WHEN numeric_precision <= 2

THEN 'Convert '+tc.Data_type+'('+

CAST(numeric_precision as varchar)+',0) to tinyint data type'

WHEN numeric_precision <= 4

THEN 'Convert '+tc.Data_type+

'('+CAST(numeric_precision as varchar)+',0) to smallint data type'

WHEN numeric_precision <= 9

THEN 'Convert '+tc.Data_type+

'('+CAST(numeric_precision as varchar)+',0) to int data type'

WHEN numeric_precision <= 18

THEN 'Convert '+tc.Data_type+

'('+CAST(numeric_precision as varchar)+',0) to bigint data type'

ELSE 'Consider using an integer data type'

END as Msg

FROM INFORMATION_SCHEMA.columns tc

JOIN INFORMATION_SCHEMA.tables tb ON

tb.table_name = tc.table_name and tb.table_schema = tc.table_schema

WHERE tb.Table_Type='BASE TABLE'

AND tc.data_type IN('numeric','decimal')

AND tc.numeric_scale = 0

AND tc.numeric_precision <= 18

ORDER BY tb.table_schema,tb.table_name,tc.column_name

Approximate column types

The float and real column types in a database are approximations, rather than exact values. Generally, graphic applications use floats for smaller storage requirement, and can accept the loss of precision. So, while a float or real data type might be necessary, you should review your usage to make sure it is necessary.

Code Listing 37 searches for float and real columns in your database tables.

Code Listing 37: Real and float columns

---------------------------------------------

-- Script: SearchFloatColumns.sql

-- Identify float and real columns

---------------------------------------------

SELECT schema_name(o.schema_id) AS SchemaName,

o.name AS TableName,c.name AS columnName,

t.Name AS ColumnType

FROM sys.all_columns c

JOIN sys.objects o ON c.object_id=o.object_id

JOIN sys.types t ON t.user_type_id=c.user_type_id

WHERE t.name IN ('float','real') AND o.type='U'

You can read this article to see if floating point arithmetic is necessary for your application.

Unexpected columns

There are often columns in a database table that hold standard information (such as phone numbers, email addresses, and state codes). Sometimes, these columns have unexpected sizes. (For example, one system used a varchar(max) to store phone numbers).

Max characters

Code Listing 38 shows all columns using varchar or nvarchar max, even though the column name suggests it is not needed. You should adjust the list of searched column names for more common suggestions, based on your knowledge of your application.

Code Listing 38: Max columns check

---------------------------------------------

-- Script: VarChar_Max_check.sql

-- Maximum columns that might not be needed

---------------------------------------------

SELECT schema_name(o.schema_id) AS SchemaName,

o.name AS TableName,c.name AS columnName,

t.Name+'(max)' AS ColumnType

FROM sys.all_columns c

JOIN sys.objects o on c.object_id=o.object_id

JOIN sys.types t on t.user_type_id=c.user_type_id

WHERE t.name like '%varchar%' and c.max_length < 0

AND o.schema_id <> 4

AND (c.name LIKE '%phone%' or c.name LIKE '%address%')

Nondate columns

Another scenario seen in databases is columns that are holding dates, but not using a date column type. Code Listing 39 searches for such a column type.

Code Listing 39: Date columns using other data types

-------------------------------------------------------

-- Script: Date_columns_check.sql

-- Date values possibly stored in nondate columns

-------------------------------------------------------

SELECT schema_name(o.schema_id) AS SchemaName,

o.name AS TableName,c.name AS columnName,

t.Name AS ColumnType

FROM sys.all_columns c

JOIN sys.objects o on c.object_id=o.object_id

JOIN sys.types t on t.user_type_id=c.user_type_id

WHERE t.name not like '%date%'

AND (c.name LIKE '%date%')

AND o.schema_id <> 4

ORDER BY SchemaName,TableName,columnName

Note that in both these queries, we are filtering out schema_id 4 (the sys schema).

Summary

You can use the various views to optimize your columns, hopefully identifying problematic columns and, where possible, simplifying the data types. SQL Server is a powerful tool, and by giving it the best column types and size, you can improve database integrity and performance.

Build apps 2X faster

using Syncfusion Essential Studio^® suite

1800+ high-performance UI components.
Includes popular controls such as Grid, Chart, Scheduler, and more.
24x5 unlimited support by developers.

Get Your Free Trial Now

Tables and Columns

Tables

sys.identity_columns (starting with SQL 2008)

sys.computed_columns (starting with SQL 2008)

sys.default_constraints (starting with SQL 2008)

sys.index_columns (starting with SQL 2008)

sys.key_constraints (starting with SQL 2008)

sys.check_constraints (starting with SQL 2008)

sys.masked_columns (starting with SQL 2016)

Putting it all together

Searching for deprecated columns

Numeric columns

Approximate column types

Unexpected columns

Max characters

Nondate columns

Summary

DISCLAIMER: Web reader is currently in beta. Please report any issues through our support system. PDF and Kindle format files are also available for download.