Multiple records returned by a function.
Example:
convert integers to words
'41' => 'forty one'
'17' => 'seven teen'
'501' => 'five hundred one'
illustrate converting one integer, and then returning a whole series.
Also shows use of array variables
CREATE OR REPLACE FUNCTION namenumbers( n INTEGER )
RETURNS VARCHAR
AS $$
DECLARE
ary INTEGER[];
BEGIN
ary[0] := 1;
RETURN 'one';
END;
$$ LANGUAGE plpgsql;
SELECT namenumbers(%s);
SELECT ... INTO
A lesson a week in PL/pgSQL. As I learn how to do useful things in PL/pgSQL, I will share the lessons with you.
Monday, January 6, 2014
Thursday, January 27, 2011
More elegant forms of returning records.
Like the blog subtitle says, I learn PL/pgSQL as I go, and share the lessons with you. A couple of weeks ago, I posted on returning a few records using OUT parameters. I used that method because it is what I knew. A reader, Tzvi R., kindly commented with a couple of ways to return records more gracefully. In this post, I am giving those methods more prominence than the comment provided.
SETOF RECORD
Here the record is defined within the function, keeping the parameter list a clean list of inputs. A limitation of this approach is that since the record definition is hidden from the caller, the column list must be provided as part of the retrieving
Line 5 creates a variable, of indeterminate shape, for the output record. Line 8 creates the record using a
Line 9 is a
Predefined Types
If you define the record type outside of the function, and use it in the function definition, the caller no longer needs a column list, but determines columns from the record type.
Line 1 is the type definition, defining 'square_type' as a pair of
In my opinion, this is the most elegant of the three forms; it does require an additional line to define the type, and an additional addition to the namespace, but it reads easily. The next best is the OUT parameter method. Having to specify the row shape in the calling SELECT, as in the first example above, is just too awkward.
One caution is that there is no
Thank you, Tzvi, for your assistance.
SETOF RECORD
Here the record is defined within the function, keeping the parameter list a clean list of inputs. A limitation of this approach is that since the record definition is hidden from the caller, the column list must be provided as part of the retrieving
SELECT.CREATE OR REPLACE FUNCTION squares(ct INT)
RETURNS SETOF RECORD
AS $$
DECLARE
v_rec RECORD;
BEGIN
FOR i IN 0..ct-1 LOOP
SELECT i, POWER(i,2)::INT INTO v_rec;
RETURN NEXT v_rec;
END LOOP;
END;
$$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT * FROM squares(5) AS (A INT, B INT);Line 5 creates a variable, of indeterminate shape, for the output record. Line 8 creates the record using a
SELECT .. INTO, and inserts it into the variable, giving the variable its shape. Line 9 is a
RETURN NEXT.., which adds the record to the list of records to eventually return. This form of RETURN does not terminate the function, but rather augments the return value, which gets returned whenever the function returns. In this example, we just passively fall out the bottom of the function, but a bare RETURN can be used to explicitly exit the function. The bare RETURN does not interfere with the returning of the built-up result set.Predefined Types
If you define the record type outside of the function, and use it in the function definition, the caller no longer needs a column list, but determines columns from the record type.
CREATE TYPE square_type AS (a INT, b INT);
CREATE OR REPLACE FUNCTION squares(ct INT)
RETURNS SETOF square_type
AS $$
DECLARE
v_rec square_type;
BEGIN
FOR i IN 0..ct-1 LOOP
SELECT i, POWER(i,2)::INT INTO v_rec;
RETURN NEXT v_rec;
END LOOP;
END;
$$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT * FROM squares(5);Line 1 is the type definition, defining 'square_type' as a pair of
INT fields. Lines 4 and 7 define the return type and the variable type as 'square_type'. Because the caller can see the return type of the function from the function definition, it knows to treat the return values as pairs of INT columns, and Line 17 can be appealingly spare. In my opinion, this is the most elegant of the three forms; it does require an additional line to define the type, and an additional addition to the namespace, but it reads easily. The next best is the OUT parameter method. Having to specify the row shape in the calling SELECT, as in the first example above, is just too awkward.
One caution is that there is no
CREATE OR REPLACE TYPE <typename>, so you need to use a DROP TYPE <typename> if you wish to repeat the execution of the above quoted code block complete. In another week or two, I plan to show how to use Exception handling to gracefully absorb the errors produced by redundantly creating a type.Thank you, Tzvi, for your assistance.
2 Feb changed function types to IMMUTABLE
Thursday, January 20, 2011
Beyond Little Bobby Tables
The xkcd comic about 'little bobby tables' is a classic, widely known and often quoted:
the strip title is 'Exploits of a Mom', and the punch-line is 'I hope you've learned to sanitize your database inputs'.
The comic gets quoted and linked often, generally to illustrate the point about sanitizing database inputs. While the verb sanitize is a terse comic-scale expression to represent avoiding SQL injection attacks, it is only part of securing the database. To sanitize is to remove the dangerous parts of the input. Sanitizing the input in the xkcd example would presumably involve escaping the singe quote, and would have protected the database.
However, there are other ways to accomplish the desired safety.
the strip title is 'Exploits of a Mom', and the punch-line is 'I hope you've learned to sanitize your database inputs'.
The comic gets quoted and linked often, generally to illustrate the point about sanitizing database inputs. While the verb sanitize is a terse comic-scale expression to represent avoiding SQL injection attacks, it is only part of securing the database. To sanitize is to remove the dangerous parts of the input. Sanitizing the input in the xkcd example would presumably involve escaping the singe quote, and would have protected the database.
However, there are other ways to accomplish the desired safety.
- If you use parameterized queries, the inputs are quoted for you.
- If you embed the query in a procedure, with typed parameters, the parameters are firstly forced to match the parameter type, and are then safely interpolated into the embedded query.
- Using a limited role to run the query, a role that does not have permissions to do more hazardous operations than are required for the query. For example a role sufficient to the xkcd example would require INSERT privilege, but not DROP nor DELETE nor CREATE.
These three can all be used together, providing multi-layered security. Using procedures with typed parameters will neutralize SQL injections which are matched to parameter types other than char, varchar, text, and blob. The SQL injections which map to valid textual types reach the query, but are implicitly quoted and escaped. Should someone defeat the escaping mechanism somehow (there have been a few exploits in PostgreSQL over the years, not necessarily in the escaping), then the role privilege limits would prevent the most severe of data losses.
Thursday, January 13, 2011
Returning a few rows
This week, we will look at returning a few rows from a function. Database programming typically involves getting sets of records, so this output ability is natural to look for, in a programming language that runs internally to a database engine.
Here's an example:
Record Type
The third line declares the return type to be SETOF RECORD, indicating that it will return a variable number of elements whose type is RECORD, to be specified elsewhere. The 'elsewhere' is the set of OUT parameters, which indicate that the RECORD is two integers. The OUT variables are shown in line 2. Line 1 names the function, and declares one input parameter to take a count limit.
Lines 6 through 8 define a loop, which iterates over values from 0 to the count limit. Each iteration runs line 7, which we will discuss in the next paragraph. Lines 4,5,9,10 are just boilerplate that is found in every PL/pgSQL function. Line 12 provides a context in which to run the function and display the results.
Get Record from SELECT
Line 7 is:
Starting in version 8.4, there is a nice
Multiple Records
If the embedded query returns multiple records, that works too, as all such records get included; witness this variation to the example, which provides both squares and cubes:
RETURN NEXT
An alternate way to return records is with the
Later PostgreSQL versions support more elegant syntax for returning records, but the above patterns do the job, and can still be used in current versions.
That's enough for this week. This week, we looked at getting multiple records out of the function. Next week, we will look at getting multiple rows into the function to work with.
Here's an example:
CREATE OR REPLACE FUNCTION squares(IN ct INTEGER, OUT INTEGER, OUT INTEGER) RETURNS SETOF RECORD AS $$ BEGIN FOR i IN 0..ct-1 LOOP RETURN QUERY SELECT i, POWER(i,2)::INTEGER; END LOOP; END; $$ LANGUAGE 'plpgsql'; SELECT * FROM squares(3);
Record Type
The third line declares the return type to be SETOF RECORD, indicating that it will return a variable number of elements whose type is RECORD, to be specified elsewhere. The 'elsewhere' is the set of OUT parameters, which indicate that the RECORD is two integers. The OUT variables are shown in line 2. Line 1 names the function, and declares one input parameter to take a count limit.
Lines 6 through 8 define a loop, which iterates over values from 0 to the count limit. Each iteration runs line 7, which we will discuss in the next paragraph. Lines 4,5,9,10 are just boilerplate that is found in every PL/pgSQL function. Line 12 provides a context in which to run the function and display the results.
Get Record from SELECT
Line 7 is:
RETURN QUERY SELECT i, POWER(i,2):INTEGER. The RETURN tells us this line is part of the results. The QUERY catches the results of the subsequent SELECT, as an ordinary SELECT statement is not permitted in PL/pgSQL. The SELECTs in PL/pgSQL must be qualified somehow, with an INTO clause or a QUERY as above. The rest of the line just creates a two part record of the counter index and its square. The ::INTEGER cast avoids a type error; as POWER is a NUMERIC returning function.Starting in version 8.4, there is a nice
TABLE syntax, for returning records and I will discuss that in a future post. The Rdbhost server does not support 8.4 at present, but is scheduled for an upgrade.Multiple Records
If the embedded query returns multiple records, that works too, as all such records get included; witness this variation to the example, which provides both squares and cubes:
BEGIN
FOR i IN 0..ct-1 LOOP
RETURN QUERY SELECT i, POWER(i,2)::INTEGER
UNION SELECT i, POWER(i,3)::INTEGER;
END LOOP;
END;RETURN NEXT
An alternate way to return records is with the
RETURN NEXT syntax. This example names the two output parameters, assigns to them, and then adds their aggregate to the return set.DROP FUNCTION squares(INTEGER);
CREATE OR REPLACE FUNCTION squares(IN ct INTEGER,
OUT a INTEGER, OUT b INTEGER)
RETURNS SETOF RECORD
AS $$
BEGIN
FOR i IN 0..ct-1 LOOP
a := i;
b := POWER(i,2)::INTEGER;
RETURN NEXT;
END LOOP;
END;
$$ LANGUAGE 'plpgsql';
SELECT * FROM squares(3);
Line 3 names the output parameters. Lines 8 and 9 put values into those named output parameters. Line 10 RETURN NEXT creates a record from the above parameters, and adds it to the result set. The columns created are named after the parameters, so you might want names longer and more informative than 'a' and 'b'.Later PostgreSQL versions support more elegant syntax for returning records, but the above patterns do the job, and can still be used in current versions.
That's enough for this week. This week, we looked at getting multiple records out of the function. Next week, we will look at getting multiple rows into the function to work with.
Thursday, January 6, 2011
What's next..
PL/pgSQL is basically a language for writing functions. You cannot just inline a PL/pgSQL statement into a query wherever. You define functions, and then use those functions to enhance queries, or to perform automated actions as triggers.
I will not be covering PL/pgSQL comprehensively, but in terms of what I would find useful to have. Here is a short list of stuff I hope to cover:
I hope to write a post a week, published on Thursday, but I'm only 4 or 5 weeks in, and am already struggling to keep up, using this outline as a stop-gap post for this week. I hope to have something more substantial for next week, and stay a post or two ahead; we will see how it goes.
I will not be covering PL/pgSQL comprehensively, but in terms of what I would find useful to have. Here is a short list of stuff I hope to cover:
- How to structure a function; what are the minimum set of parts.
- How to update or insert a record with a PL function call.
- How to return a simple calculated value
- How to return a few selected records, with and without enhancing calculations
- How to return generated data as if they were a set of records
- How to return summaries of selected data
- How to write adaptive functions, that change their return sets based on what tables they are used on.
- How to create a trigger from a function.
I hope to write a post a week, published on Thursday, but I'm only 4 or 5 weeks in, and am already struggling to keep up, using this outline as a stop-gap post for this week. I hope to have something more substantial for next week, and stay a post or two ahead; we will see how it goes.
Thursday, December 30, 2010
A row at a time.
This week, we will be discussing functions that use exactly one row. Functions can take individual rows as parameters, and can generate individual rows. We will be discussing various syntax forms used for these purposes.
But first, an aside:
As I am trying to think up examples of good PL/pgSQL functions, I keep asking myself the question: why use a function at all, rather than plain old SQL? One reason is that a function can execute as the creator role, even when called by another role. So it functions as a 'sudo', allowing narrow functionality to be used by a limited privilege role that would otherwise require a higher privilege. The 'SECURITY DEFINER' phrase in the function definition gives the function that quality. The default would be 'SECURITY INVOKER', where the function runs at the privilege of the caller.
Back to rows:
Table-name as Type
If your function is to accept a row of a particular table, you can just use the table name as the type in the parameter list. FWIW, you are actually specifying an aggregate type which happens to share a name with the table.
There does not seem to be a way to pass a generic table-row as a parameter, so the above could not be written to categorize the populations of just any table, generically, that has a population field.
ROWTYPE
Before we move on to outputting rows, let's look at a syntax feature that can facilitate creating rows. We can declare a variable, in the DECLARE section of the function, to be of a table's ROWTYPE. It has attributes for each column of that table, which can then be assigned to. They can be read, as well, though they will initially be NULL.
RETURNS tablename
This example illustrates using a table-name as the return type, meaning that the return value will match that row type. If you are testing these on Rdbhost, you need to double up the '%', as the '%R' gets confused with a substitution token and errors about 'too few arguments'.
RETURNS record
In the last code sample, we saw how a function can return a record by declaring the return type as the tablename, and using a record of that row-type as the return value.
An alternative method of returning a row value is to use OUT parameters. The parameters represent columns of the resulting row. This example uses only OUT parameters, though you can use IN (and INOUT) parameters in the same parameter list. In this example, a row of 3 values is returned, and that row is identical to the above example.
Forget any notion of reference variables from other languages, which they may sorta look like. They are just a list of columns to return, confusingly appended to the parameter list.
The code sample above returns the following record.
But first, an aside:
As I am trying to think up examples of good PL/pgSQL functions, I keep asking myself the question: why use a function at all, rather than plain old SQL? One reason is that a function can execute as the creator role, even when called by another role. So it functions as a 'sudo', allowing narrow functionality to be used by a limited privilege role that would otherwise require a higher privilege. The 'SECURITY DEFINER' phrase in the function definition gives the function that quality. The default would be 'SECURITY INVOKER', where the function runs at the privilege of the caller.
If this function was created by a role with greater privilege (such as the Super role at an Rdbhost database), then it can be executed by another role (say a Reader or Preauth role at Rdbhost), and it can affect tables and other resources as if it were executed by the greater role.CREATE OR REPLACE FUNCTION sudo( INT, INT ) RETURNS INT AS $$ BEGIN -- some operation requiring elevated privilege END; $$ LANGUAGE plpgsql SECURITY DEFINER;
Back to rows:
Table-name as Type
If your function is to accept a row of a particular table, you can just use the table name as the type in the parameter list. FWIW, you are actually specifying an aggregate type which happens to share a name with the table.
CREATE OR REPLACE FUNCTION ctysize( cty capitols )
RETURNS TEXT
AS $$
BEGIN
IF cty.population > 10000000 THEN
RETURN 'LARGE';
ELSIF cty.population > 1000000 THEN
RETURN 'MEDIUM';
ELSE
RETURN 'SMALL';
END IF;
END;
$$ LANGUAGE plpgsql;
SELECT ctysize(ROW(city,country,population)) FROM capitols LIMIT 3;This could have been done in plain SQL using a CASE, but this does illustrate how to create a ROW from a set of fields.There does not seem to be a way to pass a generic table-row as a parameter, so the above could not be written to categorize the populations of just any table, generically, that has a population field.
ROWTYPE
Before we move on to outputting rows, let's look at a syntax feature that can facilitate creating rows. We can declare a variable, in the DECLARE section of the function, to be of a table's ROWTYPE. It has attributes for each column of that table, which can then be assigned to. They can be read, as well, though they will initially be NULL.
DECLARE
ctyrec capitols%%ROWTYPE;
BEGIN RETURNS tablename
This example illustrates using a table-name as the return type, meaning that the return value will match that row type. If you are testing these on Rdbhost, you need to double up the '%', as the '%R' gets confused with a substitution token and errors about 'too few arguments'.
CREATE OR REPLACE FUNCTION newcty(name TEXT, nation TEXT, size INT)
RETURNS capitols
AS $$
DECLARE
ctyrec capitols%ROWTYPE;
BEGIN
ctyrec.city := name;
ctyrec.country := nation;
ctyrec.population := size;
RETURN ctyrec;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM newcty('caracas','venezuala',10000);
The example shows how to create a new record using values provided as arguments. It could be used as input to an SQL insert, like:INSERT INTO capitols SELECT * FROM newcty('caracas','venezuala',10000);Again, this example is so lame that it could be replaced with a straight SQL query of similar complexity. Good simple examples are hard to find. I was disappointed to discover that the ROWTYPE declaration does not imply inclusion of constraints. You can put values into the compound-variable that are not permissible in the table, and you won't learn until you attempt to insert it into the table.RETURNS record
In the last code sample, we saw how a function can return a record by declaring the return type as the tablename, and using a record of that row-type as the return value.
An alternative method of returning a row value is to use OUT parameters. The parameters represent columns of the resulting row. This example uses only OUT parameters, though you can use IN (and INOUT) parameters in the same parameter list. In this example, a row of 3 values is returned, and that row is identical to the above example.
CREATE OR REPLACE FUNCTION newctyrec( name TEXT, nation TEXT, pop INT,
OUT TEXT, OUT TEXT, OUT INT )
RETURNS record
AS $$
BEGIN
$4 = name;
$5 = nation;
$6 = pop;
END;
$$ LANGUAGE plpgsql;
SELECT newctyrec('caracas','venezuala',1000);
The IN parameters are not declared as IN, since that is the default mode. The OUT parameters are referenced by their number, as I did not give them names. The 'record' return type represents whatever row shape is indicated by the OUT parameters. Since the OUT parameters are not in fact, parameters at all, the confusion factor in this is high. I recommend formatting your function definitions so that the OUT parameters are on their own line. Forget any notion of reference variables from other languages, which they may sorta look like. They are just a list of columns to return, confusingly appended to the parameter list.
The code sample above returns the following record.
There it is; not the prettiest syntax, but it works.(caracas,venezuala,1000)
Thursday, December 23, 2010
PostgresOnline: Quick Guide to Writing PLPGSQL
Just this morning, I discovered a good short series on writing PL/pgSQL functions. These guides are written as part of the Postgres Online Journal, by Regina Obe and Leo Hsu. Aside from this short series (from back in 2008, but still very relevant), there is a lot of interesting stuff there.
The Quick Guide to Writing PL/pgSQL.
Part-1
Part-2
Part-3
They also produce a nice Cheat-sheet for PL/pgSQL.
Cheat-Sheet
The Quick Guide to Writing PL/pgSQL.
Part-1
Part-2
Part-3
They also produce a nice Cheat-sheet for PL/pgSQL.
Cheat-Sheet
Subscribe to:
Posts (Atom)
