CHAPTER 7
In this chapter we'll look at some of the SQL commands that can only be found in Postgres. There will be some overlap between Postgres and other database systems, and every database engine has its own way of achieving some of the things we'll be looking at here. However, I'll be picking features that are specific to Postgres, and even where they are available on other platforms, I've picked them mainly because Postgres performs them so well that I would never consider using another platform or database engine for that feature.
Let's kick off with…
All database engines can convert from one data type to another. It's a required process since database engines are designed to handle various types of data.
If all databases have to handle this functionality, then why is Postgres so special?
Well, most databases implement data conversion in a very long-winded way. Assuming that we have a table with an integer column and a timestamp column in it, let me show you what I mean:
SELECT CAST(column1 AS varchar(30)), CAST(column2 AS DATE) FROM mytable;
This statement selects the values in column1 and column2 from a table called mytable, and in the process uses the CAST function to convert them to a varchar(30) and a date respectively.
To most developers this is not really a problem; it's the norm. We nest functions, and we string things together in a fluent manner, so seeing something like the previous statement really doesn't shock us.
Unfortunately, to many developers the previous code is rather messy and hard to maintain. The authors of Postgres knew this, so they decided to add the same functionality in a different way:
SELECT column1::varchar(30), column2::date FROM mytable;
As you can see, this statement reads much more easily than the first one. It's also more intuitive, and it's easy to recognize exactly what's going on.
This on-the-fly data conversion is not restricted to SELECT either; you can also use it with INSERT:
INSERT INTO mytable(column1, column2) VALUES('100'::integer, '2013-05-30T19:30'::timestamp)
The same is also true for functions:
SELECT myfunction(column1::varchar(30))
Anywhere you would use a normal column or variable specification in your queries, you can use data type casting to convert values on-the-fly.
There is one thing to note, though: it's not magic. For example, it will still raise an error if you try to convert "hello" to an integer, but as long as you're handling errors in some fashion, you can watch for and deal with this scenario.
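As a quick illustration (a throwaway query, not tied to any of the tables in this chapter), the first cast below succeeds while the second raises an invalid input error:

SELECT '100'::integer;    -- succeeds, returns the integer 100
SELECT 'hello'::integer;  -- fails with an invalid input syntax error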
The Postgres manual page detailing this can be found at:
http://www.postgresql.org/docs/9.2/static/sql-expressions.html#SQL-SYNTAX-TYPE-CASTS
How many times have you had to write the same boilerplate business logic to make sure that a set of values fall into the same range?
Take for example two product price values.
These values would generally be represented by a standard decimal number value, one might be a full price, and another might be a discounted price.
In each case the standard data types will still allow negative numbers, and without business logic in your application code, you can't enforce that the discount price is always less than the full price.
Postgres can solve these problems directly in the database by using something called a check constraint.
Check constraints are defined at table creation time, just like normal SQL constraints such as not null. Taking our price example, we could define a table as follows (there is no need to enter this right now; it's for illustration against regular SQL only):
create table pricetable(
    record_id serial primary key,
    fullprice numeric(12,2),
    discountprice numeric(12,2)
);
As you can see, there's nothing special about it; it's a simple table definition.
If we now rewrite our DDL to enforce our rules:
create table pricetable(
    record_id serial primary key,
    fullprice numeric(12,2) check(fullprice > 0),
    discountprice numeric(12,2) check(discountprice > 0 AND discountprice < fullprice)
);
You'll see that it's as simple as adding a check to the column definition line.
Just by adding those two simple check constraints to our table, we've now enforced those business rules directly on our table. Postgres will now refuse to allow data that violates these business rules to be inserted into this table.
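For example (assuming the pricetable defined above exists), inserts that break either rule are rejected with a check constraint violation, while a valid one goes through:

insert into pricetable(fullprice, discountprice) values (20.00, 15.00);  -- accepted
insert into pricetable(fullprice, discountprice) values (20.00, 25.00);  -- rejected: discountprice is not less than fullprice
insert into pricetable(fullprice, discountprice) values (-5.00, 1.00);   -- rejected: fullprice is not greater than 0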
We can also go one step further.
The checks we've added are known as column checks; each one is attached to a single column and should reference only that column. You can also place a check on the table itself, which lets a single check reference more than one column.
create table pricetable(
    record_id serial primary key,
    fullprice numeric(12,2) check(fullprice > 0),
    discountprice numeric(12,2) check(discountprice > 0),
    check(discountprice < fullprice)
);
When you add a check at the table level, it can compare values across columns, and it is evaluated whenever a row in the table is inserted or updated, not just when the columns it references change.
Taking our example, any time a row is inserted or updated, fullprice is checked to make sure that it is greater than 0, and the same is true for discountprice. The discountprice < fullprice check is evaluated even if only record_id changes, meaning that no matter what the update touches, your data integrity is always re-evaluated.
Checks can also do some complex math across multiple columns.
Take the following example:
create table people(
    id bigint primary key,
    first character varying(100) not null,
    last character varying(100),
    email character varying(200),
    dateofbirth date not null
)
This is another fairly typical table definition, again nothing special. However, if we now add the following table level check:
create table people(
    id bigint primary key,
    first character varying(100) not null,
    last character varying(100),
    email character varying(200),
    dateofbirth date not null,
    check(
        DATE_PART('year', dateofbirth::timestamp) < DATE_PART('year', now() - interval '18 years')
    )
)
You now have a self-validating table that won't allow you to add or change a date of birth in a way that would make the person 18 years of age or younger (the check compares only the year parts, so it's an approximate rule).
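To see the constraint in action (the values here are purely illustrative), an insert with a recent date of birth is rejected while an older one is accepted:

insert into people values (1, 'Older', 'Person', 'older@example.com', '1970-01-01');     -- accepted
insert into people values (2, 'Younger', 'Person', 'younger@example.com', '2010-01-01'); -- rejected by the check constraint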
You wouldn't rely on this entirely (someone could easily lie about his or her date of birth, for example), but you can begin to see how easily you can add an extra layer of validation to your applications.
Check constraints can use any combination of column data, operators, and functions, including custom functions (though not subqueries). The only limit in how you combine them is your imagination.
You can find the manual page about data validation constraints at http://www.postgresql.org/docs/9.2/static/ddl-constraints.html.
I've mentioned Postgres array handling a couple of times already, but now it's time to show it off a little bit.
Many database purists would argue that having the capability to put multiple values in a single value field is a bad design decision, and in many cases it does detract from the nature of having fully normalized rows of data in well-designed tables.
However, there are some cases where these multi-value fields can be useful. The thing to remember as I go through this next section is that this is just another tool in the Postgres toolkit.
You don't have to use arrays if they don't fit the problem you're trying to solve; it's up to you to decide whether they are the right tool for the job.
Like much of the Postgres extended functionality, arrays are defined at table creation time. You can use any of the Postgres types available and turn it into an array just by adding square brackets to the data type.
Take the following example:
create table lotterynumbers(
    name character varying(50),
    numbers integer[]
)

If you've already run this statement, be aware that running the next one immediately afterward will give you an error because the table name is the same. You will need to run drop table lotterynumbers to delete the first table before trying the second.
The name column is just a standard string, and the numbers column is an array of undefined length. If we wanted to make the array a fixed size, we simply add that size to the table definition:
create table lotterynumbers(
    name character varying(50),
    numbers integer[6]
)
In principle, only six integers can now be stored in the field (note, though, that Postgres doesn't actually enforce the declared array size; it's treated as documentation).
Once the table has been defined, you can populate it as follows:
insert into lotterynumbers(name, numbers) values('peter', { 1,2,3,4,5,6 })
Once you have data in an array field, you can easily access it using an index (in a very similar fashion to .NET and many other languages). Something to note here: In my example insert, you'll see that I don't enclose the {} array specification in single quotes. This is the way it's always worked as far as I'm aware. The version (v9.0) that I'm testing this on accepts the line without the quotes, but it appears that newer versions do not.
If you get a syntax error while trying the inserted code, then try repeating it with single quotes around the array specification as follows:
insert into lotterynumbers(name, numbers) values('peter', '{ 1,2,3,4,5,6 }')
Also keep this in mind for future statements using array syntax from this point on. If you get a syntax error, try using single quotes around the specifier.
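If you'd rather avoid the quoting question altogether, Postgres also supports the ARRAY constructor syntax, which takes a plain list of values:

insert into lotterynumbers(name, numbers) values('peter', ARRAY[1,2,3,4,5,6])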
To get back only the first lottery number (Postgres array indexes start at 1):
select name,numbers[1] from lotterynumbers
You can also select a slice of the array:
select name,numbers[3:4] from lotterynumbers
Just as you can select by index, you can also update individual indexes:
update lotterynumbers set numbers[5] = 50;
And again, you can also use slices:
update lotterynumbers set numbers[2:3] = { 7,8 };
Replacing the entire array is just as easy:
update lotterynumbers set numbers = { 9,8,7,6,5,4 };
If you haven’t set any limits on the array, then you can arbitrarily extend it by using the concatenation operator:
update lotterynumbers set numbers = numbers || { 8,9,10 }
As well as the previous examples, you can easily perform array-wide searches.
If you have a fixed-size array, you can do the following:
select * from lotterynumbers where numbers[1] = 10 and numbers[6] = 20
If you have a bigger array, or your array is not a fixed size, you can use the ANY operator, which behaves much like the .NET Any operator in LINQ: a row is returned if any element of the array satisfies the condition.
select * from lotterynumbers where 20 = ANY(numbers)
Notice that the syntax is not quite what you might expect. It's done this way because you’re actually looking to see if anything in the array matches your criteria, as opposed to looking in the array to see if it contains your criteria.
If, instead, you want to match only rows where every element of the array satisfies the condition, you can use the ALL operator:
select * from lotterynumbers where 20 = ALL(numbers)
You can find much more detailed information, including how to use multidimensional arrays and custom (even negative) index bounds, on the Postgres manual page at http://www.postgresql.org/docs/9.2/static/arrays.html.
Please note that I've linked the 9.2 version. You may need to adjust the version number to get the correct page.
Inherited tables are very simple; so simple, in fact, that you don't actually need to learn anything new to use them other than a single keyword.
Just like with many other unique Postgres features, everything is built on standard SQL constructs. Let's use the idea discussed in previous chapters, and show the SQL that might be used to construct our staff tables. First, our master table:
create table staff(
    fullname character varying(100) not null,
    wageband character varying(25) not null,
    phonenumber character varying(10),
    department character varying(25) not null
)
Like many other examples, it's a standard bit of DDL. In .NET, this would be the base class. We would then inherit our other tables from it:
create table itstaff(
    languages character varying(50)[],
    coursestaken character varying(50)[],
    currentproject character varying(25),
    systemowned character varying(25)
) inherits(staff);

create table accountsstaff(
    projectsowned character varying(50)[],
    clientsresponsiblefor int[],
    subdepartment int,
    officename character varying(50)
) inherits(staff);
If you now use select on those tables, you should see the following:
select * from itstaff

Output pane
select * from accountsstaff

Output pane
You can see immediately that both tables have the same columns as the staff table, as well as the extra columns that were added in each of their own definitions.
Data is also inherited but not quite in the way you might think.
Inserts into the parent table do not show up in the child tables; it's a common misconception among those new to inheritance in Postgres that they do.
Data is ONLY inherited up the tree, not down it, so in this case inserting data into staff will not be visible in the itstaff or accountsstaff tables:
insert into staff values('Peter Shaw', 'A', '0123456789', 'Technical');

select * from staff

Output pane
select * from itstaff

Output pane
select * from accountsstaff

Output pane
However, because data propagates up the tree, performing the following:
insert into itstaff values(
    'Peter Shaw 2', 'B', '', 'Development',
    '{C, Pascal, C++, C#, PHP}', '{dev1, dev2, dev3}', 'Postgres Book', 'DS'
)
results in this:
select * from itstaff

Output pane
select * from staff

Output pane
This also applies to the accountsstaff table:
insert into accountsstaff values(
    'Peter Shaw 3', 'C', '', 'Admin',
    '{proj1, proj2, proj3}', '{1, 2, 3}', 20, 'North Durham'
);

select * from accountsstaff

Output pane
select * from staff

Output pane
As you can see, this could potentially be very useful in some situations. You most likely would not use the parent table for most data modification operations, but you might in theory use it for reporting.
A read-only table that has every staff member in the organization in it, for example, is much easier to read and perform reports against than several separate tables.
The departments responsible for each type of employee can then update and keep their own data in order without causing problems for the others.
Note also that there is no sideways manipulation; that is, a record added and maintained in the accountsstaff table is not visible in the itstaff table and vice versa.
Another added advantage is that any schema changes made to the parent table are instantly replicated to inherited tables.
If we add a new column to our staff table:
alter table staff add column age integer;

select * from staff
we get:

Output pane
And then going on to do:
select * from itstaff
and
select * from accountsstaff
we get:

Output pane
and

Output pane
Both changes are instantly usable as described previously, so updating the age in a child table will show up in the parent table.
There is one instance where data changes in the parent table do show up in the child tables, and that is if you change data that was inserted into the database using a child table.
Take the following example:
select * from staff

Output pane
Peter Shaw 2 shows up because it was inserted as part of the record in itstaff. If we update that as follows:
update staff set age = 21 where fullname = 'Peter Shaw 2';

select * from staff

Output pane
And then also perform:
select * from itstaff

Output pane
We can see that the data has propagated down.
This shouldn't be a surprise though. It makes sense that anything created elsewhere, and that can be modified elsewhere, would show up in different places.
It does mean, however, that when you design databases using inherited tables, you absolutely must make sure you have your designs properly planned out beforehand.
If you’re a software developer who is accustomed to using strict object-oriented principles, then what you've just done here will likely make a lot of sense. Postgres is a database that's very strong on object-oriented design and allows you to treat data stores just as you would treat objects in a software application.
Inheritance is a very powerful tool, but it can also cause a lot of problems that are hard to diagnose. I know because I'm usually the one diagnosing them.
The section on table inheritance in the Postgres manual can be found at http://www.postgresql.org/docs/9.2/static/ddl-inherit.html.
Most databases have some form of XML processing built into them, and most of them make it horribly complicated. Postgres, on the other hand, makes handling XML simple with a slew of very easy-to-use XML construction functions.
It also natively supports an XML data type, so you can insert XML data directly into a database field and then query that XML data using XPath. We'll cover that later.
To create data suitable for an XML field in a database table, use the xmlparse function, which parses XML character data and returns an XML data type.
Xmlparse takes two parameters, one describing the scope of the generated XML string, and the other the XML data itself. Note, however, that xmlparse is defined in the SQL 2008 standards document. In typical fashion, Postgres has its own way of performing the deed with simpler syntax, as well as supporting the standards.
xml '<tag>tagvalue</tag>'
or
'<tag>tagvalue</tag>'::xml
I'll use the Postgres syntax for the rest of this section. If you want to read up on how to use xmlparse you can do so on the XML data type page in the Postgres manual at http://www.postgresql.org/docs/9.2/static/datatype-xml.html.
Here is a quick example for those who want one to try now:
SELECT XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
You can change CONTENT to DOCUMENT, but note that DOCUMENT requires the input to be a well-formed XML document with a single root element, whereas CONTENT also accepts fragments like the one above.
The inverse operation is the xmlserialize function, which returns a character string value from an XML data type as follows:
select xmlserialize(CONTENT 'tagvalue'::xml as character varying)

The xmlserialize function
The xmlserialize function takes a Postgres XML data type and converts it back into a standard character string, unlike the preceding xmlparse statement, which takes a standard XML-formatted string and parses it into an XML data type. Using the two together (or the simplified syntax if you prefer) is the key to getting XML data into and out of a Postgres database in its own native XML storage type. You'll also notice that both of them can take either a CONTENT or DOCUMENT keyword. The difference is that DOCUMENT expects a complete, well-formed XML document with a single root element, whereas CONTENT also accepts fragments.
If your intention is to produce XML that will be saved to a stand-alone XML data file, then DOCUMENT is usually the right choice, since a stand-alone file should be a complete, well-formed document. If you're producing snippets of XML to be inserted inline in another document, or XML that will be output by an XML web service, then CONTENT is more appropriate.
You might notice that in the previous example, there are no XML tags in the text passed to xmlserialize.
The reason for this is that even when you parse a value out of an XML document, it is still presented as an XML data type. Generally you would use XPath to grab the value you need, and then xmlserialize to convert it to the correct type.
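To give a flavor of how the two work together (XPath is covered in more detail shortly, and the tag names here are purely illustrative), the following extracts a text node with xpath and serializes it back into an ordinary string:

select xmlserialize(CONTENT (xpath('/tag/text()', xml '<tag>Hello World</tag>'))[1] as text)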
A word of warning, though: the XML data type doesn't validate against DTDs or XML Schema documents, so while Postgres will reject XML that isn't well formed, it is still possible to store XML that is invalid for your particular schema if you're not careful.
Once you master using the XML data type, you can move on to using the rich set of XML functions available. I'm not going to cover all of them in this book, just the basics. You can find the whole list of them along with examples at http://www.postgresql.org/docs/9.2/static/functions-xml.html.
The first thing we’re going to look at is producing XML data from data stored in your database tables.
If you've been reading from the beginning, you'll recall that previously we created a table called mytable. It looks like the following:
select * from mytable

mytable
We'll be using this for our XML sample in just a moment.
All of the Postgres XML functions start with xml, and all of them build on previous functionality provided by the others.
Starting with something simple, we have xmlcomment:
select xmlcomment('This is a comment')
This will produce the following:

xmlcomment
Following that, we have xmlconcat, which concatenates two discrete bits of XML data together to form one. Note that it does not nest elements.
select xmlconcat('<tag>Hello</tag>','<tag>World</tag>')

xmlconcat
Just like everywhere else in Postgres, you can nest these functions too:
select xmlconcat(xmlcomment('This is a comment'),'<tag>Hello World</tag>')

xmlconcat
The next function, and the one you are most likely to use very often, is xmlelement. This function allows you to construct elements from raw data like so:
select xmlelement(name tag, 'Hello World')

xmlelement
Like the others, it can be easily nested:
select xmlconcat(xmlcomment('This is a comment'), xmlelement(name tag, 'Hello World'))

xmlelement
The xmlelement function can also be nested inside other xmlelements, and when you start to do this, you start to be able to nest and wrap XML tags:
select xmlelement(name outertag,
    xmlelement(name innertag, 'tagvalue')
)

xmlelement
Going one step further, xmlelement can nest multiple recurrences:
select xmlelement(name outertag,
    xmlelement(name innertag1, 'tagvalue'),
    xmlelement(name innertag2, 'tagvalue')
)

xmlelement
So how do you use this to export tables to XML? Quite simply: Wrap multiple xmlelements into a select that selects the table data you wish to export using another function, xmlagg.
select xmlelement(name users,
    xmlagg(
        xmlelement(name user,
            xmlelement(name gid, gid),
            xmlelement(name name, name),
            xmlelement(name email, email),
            xmlelement(name webpage, webpage)
        )
    )
)
from mytable
The xmlagg function is slightly different than the other XML functions. Its purpose, very much like the aggregate functions in general, is to take multiple generated rows from a select query and aggregate them into one XML data row.
As can be seen in the previous example, xmlagg takes the multiple rows of output produced by the inner xmlelement call and aggregates them into one set of XML tags representing the whole table, before one final xmlelement call is used to wrap the whole set in an outer users tag. The result is the following XML, which I've formatted onto multiple lines for readability.
Make a note of this XML output; we'll be using it later when we read an XML file back into your Postgres server. You can Copy and Paste it into a file called users.xml to make it ready for later use.
<users>
  <user>
    <gid>1</gid>
    <name>Peter Shaw</name>
    <email>[email protected]</email>
    <webpage>http://shawtyds.wordpress.com</webpage>
  </user>
  <user>
    <gid>2</gid>
    <name>Peter Shaw 2</name>
    <email>[email protected]</email>
    <webpage>http://shawtyds.wordpress.com</webpage>
  </user>
  <user>
    <gid>5</gid>
    <name>Peter Shaw 5</name>
    <email>[email protected]</email>
    <webpage>http://shawtyds.wordpress.com</webpage>
  </user>
  <user>
    <gid>3</gid>
    <name>Alan Person</name>
    <email>[email protected]</email>
    <webpage>http://shawtyds.wordpress.com</webpage>
  </user>
</users>
Postgres also has an even quicker way to export your entire table to XML: table_to_xml.
select table_to_xml('mytable',TRUE,TRUE,'')

table_to_xml
I'm not going to cover the different options here, but by changing them you can change the layout of the XML quite substantially. They are documented on the manual page listed previously (http://www.postgresql.org/docs/9.2/static/functions-xml.html), along with similar functions that will generate DTD and schema documents for your XML based on the original SQL used to create your table.
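As a quick sketch of what the options can do, setting the third argument (tableforest) to false wraps every row in a single outer element named after the table, with each row as a row child, instead of producing one element per row:

select table_to_xml('mytable', TRUE, FALSE, '')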
That leaves us with one last question…
How do you get your XML data back into Postgres? Unfortunately, it's not quite as straightforward as the rest. It’s not difficult, but it is slightly more involved than what we've seen so far.
Essentially, to “shred” your XML into its different parts, you need to parse the XML data using XPath.
If you've used XPath before in other places, it's quite straightforward once you get the hang of it, although it can appear confusing at first.
I'm not going to go into too much detail (there are plenty of other guides on XPath should you need them), but here’s a brief primer.
If we have the following XML snippet:
<users>
  <user>
    <gid>1</gid>
    <name>Peter Shaw</name>
    <email>[email protected]</email>
    <webpage>http://shawtyds.wordpress.com</webpage>
  </user>
</users>
We can get the value from the name element by using the following path:
/users/user/name/text()
As you can see, it's rather like navigating your file system. Once you know the path to the file you want (or in this case the element value), you can simply pick it up and use it.
To get our data out using an XPath, simply use the xpath function (note that where it says “xml abbreviated for readability,” you need to copy and paste the XML from the previous XML snippet, or if you’re feeling more adventurous, you can also paste the XML from the full user document).
select xpath('/users/user/name/text()',xml '…xml abbreviated for readability…')
This should return the following result:

xpath
If you pasted just the previous XML, there will be only one entry as shown in the preceding image. If you pasted the full document, there will be as many names as there are user records in the XML you used.
As you can see, the result is an array containing multiple XML data types. Remember what I said previously about the xmlserialize example having no tags in its input, and about using XPath to pull out the value you need? This is where you would apply xmlserialize to those extracted values to get ordinary strings back rather than XML data types.
Another thing to note: If you had multiple occurrences of the result in your path, then the array would have multiple elements in its result array. You’ll see how to deal with this in just a moment.
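As a quick preview of that (reusing the same abbreviated placeholder as before; paste in the real XML when you try it), the unnest function turns the array returned by xpath into a set of rows, one row per match:

select unnest(xpath('/users/user/name/text()', xml '…xml abbreviated for readability…')) as name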
One more thing you need to know before we can write our import routine is how to get your XML file loaded into Postgres.
Postgres has a built-in file loading routine, but it has one big limitation: It cannot access the file system outside of your Postgres data directory.
If you remember from previous chapters, this is the folder within your Postgres installation where Postgres keeps its physical data files. The Server Configuration option in the Tools menu will remind you if you've forgotten where it is. You can create folders here to organize your files, but you cannot access anything above the root data folder.
There are add-ons for Postgres that allow you to access files outside this folder, and most of the procedural languages such as PL/Perl, PL/PHP, and PL/Python can easily navigate to other files in the file system, but that's beyond the scope of this book.
For now we'll use the users.xml content that I asked you to save previously (when we exported the users table). Copy it into the data directory as a file called users.xml. (Remember, I'm working on a Windows system.)
Once we do this, we can easily load the file by calling the pg_read_file() function like so:
select xmlparse(CONTENT pg_read_file('users.xml'))
You'll notice we wrap that in a call to xmlparse so that we get a proper XML data type to work with.

xmlparse
So now that we have all the pieces, what do we do with them? First, we need a Common Table Expression (CTE). A CTE is essentially a temporary table in memory that lasts only for the duration of the query being run.
We use a CTE in this case since we only want to read the file once. If we didn't use a CTE, then we would be rereading the file in every XPath function call. While that's not a big deal for our small file here, imagine what it might be like if you were reading a multi-node XML with hundreds of entries.
Once we have a CTE, we chain a second CTE onto our query to actually extract the data using XPath into an ordered set, ready to insert into the database.
For those who are not used to using CTEs, they look a little scary at first, but everything is quite simple. The order always flows from top to bottom; if you create a table in the first CTE, it's available in the second and third. If you create a table in the third CTE, it's available only in the output or any CTEs that follow from there. Anything that goes in or is generated above is always available below, a bit like a waterfall.
We could in theory do away with the second CTE and just insert straight from there, but by doing that we would either have to put up with all our data being of an XML data type, or only being able to cast to textual outputs such as varchar and text.
By using the second CTE, we give ourselves another opportunity to select over the final data set and perform any final data conversions we need. This is important if you have any data that's an integer or number, such as our gid column.
xmlserialize and the ::<datatype> syntax can only take an XML value to a textual output type such as text or varchar. You cannot extract a numerical value and convert it in one step; this is one case where you serialize to text first and then use the cast() function to get the numeric type.
Finally, once we've defined that second CTE, we select over that final set of data, and then use a select into statement to create a new table in our database from the converted data. The final query to import our XML data back into Postgres looks like this:
with xmldata as (
    select xmlparse(CONTENT pg_read_file('users.xml'))
),
tabledata as (
    select
        unnest(xpath('/users/user/gid/text()', (select * from xmldata))) as gid,
        unnest(xpath('/users/user/name/text()', (select * from xmldata))) as name,
        unnest(xpath('/users/user/email/text()', (select * from xmldata))) as email,
        unnest(xpath('/users/user/webpage/text()', (select * from xmldata))) as webpage
)
select
    cast(xmlserialize(CONTENT gid as text) as integer) as gid,
    name::character varying(50) as name,
    email::character varying(50) as email,
    webpage::character varying(100) as webpage
into mytable2
from tabledata
Not as complicated as it seems from the description, but certainly more involved than some of what we've seen so far.
Most developers are familiar with the standard SQL operations select, insert, delete, update, etc.
One thing that's difficult to do, however, is to know exactly what data was affected, and how it was affected, when performing queries that modify table rows.
We know select will always return rows of data, but in other operations we're used to just seeing something like “1 row(s) affected.”
Postgres has a neat trick for this scenario called the RETURNING clause, as the following example shows. We'll be using the table we created from our XML in the last section.
select * from mytable2

Output pane
delete from mytable2 where gid = 3

Output pane
And using the RETURNING clause:
select * from mytable2

Output pane
delete from mytable2 where gid = 5 returning *

Output pane
select * from mytable2

Output pane
As you can see, just by adding returning * to the end of our delete statement, the row that was deleted is returned as the result of our query.
Imagine now that you wanted to create a backup table of all records deleted from the database: you could easily get this data back from your query just as you would with a regular select statement, and then write it directly into another table.
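Here's a minimal sketch of that idea, assuming a hypothetical mytable2_deleted table with the same columns as mytable2 (it uses a writable CTE, a feature we'll look at properly later in this chapter):

with removed as (
    delete from mytable2 where gid = 3 returning *
)
insert into mytable2_deleted select * from removed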
This also works with updates.
update mytable2 set email = '[email protected]' where gid = 2 returning *

Output pane
If you do this with an update, you get a copy of the updated row sent back to you. Without returning, you would have to perform the update and then follow it with a regular select to see the new data.
Inserts do exactly the same thing.
insert into mytable2 values(6,'a second person','[email protected]','http://aweb.page.com/') returning *

Output pane
You'll notice that in each of the previous examples I'm using returning *. As with any SQL query, when you use *, it means return everything.
If you only want the gid column, then you can use a regular column identifier as you would in a normal select statement:
insert into mytable2 values(7,'a third person','[email protected]','http://aweb.page.com/') returning gid

Output pane
I'll leave you to explore other possibilities here, but because it's a regular SQL construct you can use functions, aggregations, and most other Postgres features to use the returned data in some very clever ways, one of which we are going to explore next.
I've already introduced you to CTEs in the section on XML; however, it's not something that's commonly known or used in many databases.
Microsoft SQL Server has had CTEs available for some time, and I have used them to solve many real-world problems where I needed to work with temporary tables, but I've not yet seen them used on a regular basis by other developers.
Classically however, CTEs have always been read-only, that is, you've only ever been able to do select-based queries using them.
Starting from Postgres v9.1 you can now perform inserts, updates, and deletes on a CTE. This opens up some very interesting possibilities.
Let’s start with a real-world example. Using the test database you created, let's add some new tables. I'm just going to give you the SQL here, but feel free to use pgAdmin to create them using the GUI.
create table items_in_stock (
    storename text,
    itemname text,
    amountavailable int not null,
    costperitem numeric(16,2) not null
);

insert into items_in_stock values
    ('store 1', 'orangething', 45, 20.00),
    ('store 1', 'bluething', 50, 25.00),
    ('store 1', 'purplething', 30, 30.00),
    ('store 2', 'orangething', 60, 20.00),
    ('store 2', 'bluething', 10, 25.00),
    ('store 2', 'purplething', 25, 30.00);

create table sales_made (
    recordid serial primary key,
    storename text,
    itemname text,
    amountsold int not null
);

insert into sales_made(storename, itemname, amountsold) values
    ('store 1', 'orangething', 5),
    ('store 2', 'bluething', 1),
    ('store 2', 'purplething', 10),
    ('store 1', 'bluething', 15),
    ('store 2', 'orangething', 10),
    ('store 1', 'bluething', 15);
Now let's suppose we'd like to see a report of the before and after stock amounts, and how much had been made in each store based on the sales made.
First, let's start with a normal read-only CTE that can summarize things easily:
with all_sales as (
    select
        storename as storename,
        itemname as itemname,
        sum(amountsold) as totalamountsold
    from sales_made
    group by storename, itemname
),
sales_report as (
    select
        iis.storename as storename,
        iis.itemname as itemname,
        iis.amountavailable as originalamountavailable,
        als.totalamountsold as amountsold,
        iis.amountavailable - als.totalamountsold as newamountavailable,
        iis.costperitem as costperitem,
        iis.costperitem * als.totalamountsold as amountmadeinsales
    from items_in_stock as iis
    join all_sales als
        on als.storename = iis.storename
        AND als.itemname = iis.itemname
)
select * from sales_report
This code should result in the following:

Output pane
As you can see, we are shown only those items that have been sold, how many we previously had, and how many we have now. We can also see how much we made from those sales.
Now the only problem we have is that the data we see is only in memory—our tables have not been updated in any way. Now we need to push those changes into our items in stock table.
We can do that easily by rewriting our query as follows:
with all_sales as (
    select
        storename as storename,
        itemname as itemname,
        sum(amountsold) as totalamountsold
    from sales_made
    group by storename, itemname
),
sales_report as (
    select
        iis.storename as storename,
        iis.itemname as itemname,
        iis.amountavailable as originalamountavailable,
        als.totalamountsold as amountsold,
        iis.amountavailable - als.totalamountsold as newamountavailable,
        iis.costperitem as costperitem,
        iis.costperitem * als.totalamountsold as amountmadeinsales
    from items_in_stock as iis
    join all_sales als
        on als.storename = iis.storename
        AND als.itemname = iis.itemname
),
items_update as (
    update items_in_stock iis
    set amountavailable = sr.newamountavailable
    from sales_report sr
    where sr.storename = iis.storename
      AND sr.itemname = iis.itemname
    returning
        iis.storename as storename,
        iis.itemname as itemname,
        sr.originalamountavailable as originalamountavailable,
        sr.amountsold as amountsold,
        iis.amountavailable as newamountavailable,
        iis.costperitem as costperitem,
        sr.amountmadeinsales as amountmadeinsales
)
select * from items_update

Output pane
We get the same output as we did previously, only this time you'll notice we've added a third CTE to the chain.
This CTE has in it an update statement that actually updates the underlying items_in_stock table with the new values calculated in the sales_report CTE.
The result is that not only do we get our sales report returned, but our original data also gets updated for us in the process, all in a single SQL statement.
There is no need to create separate temporary tables, or to select data from those temporary tables to perform the updates separately. You didn't even need to put this into a stored procedure or database function. Sometimes Postgres just makes solving the problem at hand a breeze.
Our final section on Postgres-specific SQL will cover some of the unique data types that are available in the database.
You've already been introduced to the XML data type; now it's time to meet some of the others that can only be found in Postgres. Note that I'm only going to cover data types that are built into the system. Postgres, like many database systems, can be extended in many ways, including having new data types added to it. What I aim to show in this last section doesn’t require any such add-ons.
Even without an extension such as PostGIS, Postgres has some basic geometry types by default, such as points, lines, line segments, boxes, paths, polygons, and circles.
For the purposes of this book I'll just show the simple ones. You can read details about the rest in the Postgres manual at http://www.postgresql.org/docs/9.2/static/datatype-geometric.html.
Let's create a table of points:
create table geometry(
    recordid serial primary key,
    thedot point
)
Now let's add some data:
insert into geometry(thedot) values
    (point '(1,1)'),
    (point '(2,2)'),
    (point '(3,3)'),
    (point '(4,4)')
As you can see, you need to tell Postgres the type of each literal you're adding. Here we use the type-name prefix form (point '(1,1)'); the equivalent cast form ('(1,1)'::point) works too if you prefer it.
select * from geometry
This line gives us:

Table of points
Now let's create some circles and lines.
create table geometry2(
    recordid serial primary key,
    theline lseg
);

create table geometry3(
    recordid serial primary key,
    thecircle circle
)
And we'll insert some data:
insert into geometry2(theline) values
    (lseg '((1,1),(2,2))'),
    (lseg '((3,3),(4,4))');

insert into geometry3(thecircle) values
    (circle '((1,1),1)'),
    (circle '((3,3),1)')
Line segments are created by specifying a start x position and y position, and an end x position and y position. Circles are created by specifying a center x position and y position, and a radius.
In our previous example we created a table with two lines from 1,1 to 2,2, and 3,3 to 4,4. We then created two circles, each with a radius of 1 unit at positions 1,1 and 3,3 respectively.
Once you have your data in your various tables, it should come as no surprise that there are a number of powerful functions that allow you to do all sorts of operations on the geometry you have stored. For example, to translate all your points by (1,1):
select * from geometry

Points to translate
select recordid, thedot + point '(1,1)' as thedot from geometry

Translated points
Or to calculate the distance between our two circles:
select
    (select thecircle from geometry3 where recordid = 1) <->
    (select thecircle from geometry3 where recordid = 2) as distance

Calculated distance
There are many more functions and operators. A full list can be found at http://www.postgresql.org/docs/9.2/static/functions-geometry.html.
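To give one more taste (a small sketch reusing the geometry3 table from above), the @> containment operator and the area function let you check whether each circle contains a given point and how large it is:

select recordid,
       thecircle @> point '(1.2,1.2)' as containspoint,
       area(thecircle) as circlearea
from geometry3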
Postgres also has a built-in data type for handling network address data such as IP addresses and MAC addresses. These addresses are the numbers used to connect computers and other devices to networks like the Internet.
An IP address you may have seen before is 127.0.0.1.
This is otherwise known as a localhost (I mentioned this previously in the section on installation), a software-level address allocated by the network software on the device.
Further down the chain is the unique hardware address that all devices have, called the MAC address. It usually looks something like this: 00:16:E6:8E:BE:85.
Postgres can natively handle these data types. More importantly, when selecting, grouping, and sorting, it is aware of how to order them correctly.
As well as being able to store and manipulate them, you can also specify a network mask by using the forward slash notation: 192.168.100.0/24.
This would specify a network residing in 192.168.100 with a network mask of 24 bits, which means a class C network residing at network number 192.168.100 with hosts ranging from 0 to 255.
Unlike many of the other data types, there are only a handful of special functions for these types (for extracting the host, network, or mask length, for example). Their real strength comes from the ability to search, index, and sort them natively at high speed, where many other database systems have to fall back on strings and full-text searches, which is very often inefficient.
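As a small sketch (the hosts table and addresses here are purely illustrative), an inet column sorts numerically rather than alphabetically, so 192.168.100.9 correctly comes before 192.168.100.10:

create table hosts(
    hostname text,
    address inet
);

insert into hosts values
    ('web1', '192.168.100.10'),
    ('web2', '192.168.100.9'),
    ('db1', '10.0.0.1');

select * from hosts order by address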
The manual page for the network types can be found at http://www.postgresql.org/docs/9.2/static/datatype-net-types.html.
Absolutely.
We have the ones you might expect, such as Numeric, Monetary, Date/Time, and Binary.
Then we have things like JSON, UUID (universal IDs similar to .NET GUIDs), bit strings, number ranges, composite record types, object identifiers, and more. A full list can be found at http://www.postgresql.org/docs/9.2/static/datatype.html.
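As one last quick taste (a throwaway query, not tied to any table in this chapter), the range types added in Postgres 9.2 let you ask whether a value falls inside a range directly:

select int4range(10, 20) @> 15 as inrange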