Thursday, December 30, 2010

A row at a time.

This week, we will be discussing functions that use exactly one row.  Functions can take individual rows as parameters, and can generate individual rows.  We will be discussing various syntax forms used for these purposes.

But first, an aside:

As I am trying to think up examples of good PL/pgSQL functions, I keep asking myself the question: why use a function at all, rather than plain old SQL?   One reason is that a function can execute as the creator role, even when called by another role.  So it functions as a 'sudo', allowing narrow functionality to be used by a limited privilege role that would otherwise require a higher privilege.  The 'SECURITY DEFINER' phrase in the function definition gives the function that quality.  The default would be 'SECURITY INVOKER', where the function runs at the privilege of the caller.
CREATE OR REPLACE FUNCTION sudo( INT, INT )
  RETURNS INT
AS $$
BEGIN
  -- some operation requiring elevated privilege
END;
$$ LANGUAGE plpgsql
  SECURITY DEFINER;
If this function was created by a role with greater privilege (such as the Super role at an Rdbhost database), then it can be executed by another role (say a Reader or Preauth role at Rdbhost), and it can affect tables and other resources as if it were executed by the greater role.

Back to rows:

Table-name as Type
If your function is to accept a row of a particular table, you can just use the table name as the type in the parameter list.  FWIW, you are actually specifying an aggregate type which happens to share a name with the table.
CREATE OR REPLACE FUNCTION ctysize( cty capitols )
 RETURNS TEXT
AS $$
  BEGIN
    IF    cty.population > 10000000 THEN
         RETURN 'LARGE';
    ELSIF cty.population > 1000000 THEN
         RETURN 'MEDIUM';
    ELSE
         RETURN 'SMALL'; 
    END IF;
  END;
$$ LANGUAGE plpgsql;

SELECT ctysize(ROW(city,country,population)) FROM capitols LIMIT 3;
This could have been done in plain SQL using a CASE, but this does illustrate how to create a ROW from a set of fields.

There does not seem to be a way to pass a generic table-row as a parameter, so the above could not be written to categorize the populations of just any table, generically, that has a population field.

ROWTYPE
Before we move on to outputting rows, let's look at a syntax feature that can facilitate creating rows.  We can declare a variable, in the DECLARE section of the function, to be of a table's ROWTYPE.  It has attributes for each column of that table, which can then be assigned to.  They can be read, as well, though they will initially be NULL.
DECLARE
    ctyrec capitols%%ROWTYPE;
  BEGIN 


RETURNS tablename
This example illustrates using a table-name as the return type, meaning that the return value will match that row type.  If you are testing these on Rdbhost, you need to double up the '%', as the '%R' gets confused with a substitution token and errors about 'too few arguments'.
CREATE OR REPLACE FUNCTION newcty(name TEXT, nation TEXT, size INT)
 RETURNS capitols
AS $$
  DECLARE
    ctyrec capitols%ROWTYPE;
  BEGIN
    ctyrec.city := name;
    ctyrec.country := nation;
    ctyrec.population := size;
    RETURN ctyrec;
  END;
$$ LANGUAGE plpgsql;

SELECT * FROM newcty('caracas','venezuala',10000);
The example shows how to create a new record using values provided as arguments.  It could be used as input to an SQL insert, like:
INSERT INTO capitols SELECT * FROM newcty('caracas','venezuala',10000);
Again, this example is so lame that it could be replaced with a straight SQL query of similar complexity.  Good simple examples are hard to find.  I was disappointed to discover that the ROWTYPE declaration does not imply inclusion of constraints.  You can put values into the compound-variable that are not permissible in the table, and you won't learn until you attempt to insert it into the table.

RETURNS record
In the last code sample, we saw how a function can return a record by declaring the return type as the tablename, and using a record of that row-type as the return value.

An alternative method of returning a row value is to use OUT parameters.  The parameters represent columns of the resulting row.  This example uses only OUT parameters, though you can use IN (and INOUT) parameters in the same parameter list.  In this example, a row of 3 values is returned, and that row is identical to the above example.
CREATE OR REPLACE FUNCTION newctyrec( name TEXT, nation TEXT, pop INT,
                                      OUT TEXT, OUT TEXT, OUT INT )
 RETURNS record
AS $$
  BEGIN
    $4 = name;
    $5 = nation;
    $6 = pop;
  END;
$$ LANGUAGE plpgsql;

SELECT newctyrec('caracas','venezuala',1000);
The IN parameters are not declared as IN, since that is the default mode.  The OUT parameters are referenced by their number, as I did not give them names.  The 'record' return type represents whatever row shape is indicated by the OUT parameters.  Since the OUT parameters are not in fact, parameters at all, the confusion factor in this is high.  I recommend formatting your function definitions so that the OUT parameters are on their own line.

Forget any notion of reference variables from other languages, which they may sorta look like.  They are just a list of columns to return, confusingly appended to the parameter list.

The code sample above returns the following record.
(caracas,venezuala,1000)
There it is; not the prettiest syntax, but it works.

No comments:

Post a Comment