Thursday, December 9, 2010

Our first Function: Simple Record Insertion

This initial post will discuss a simple function to insert a record into a table.  The table is a simple list of capitol cities, with a city name field, and a country name field.

Here is the function definition, and line by line explanation will follow:

CREATE OR REPLACE FUNCTION  addcity
  ( cty VARCHAR, cntry VARCHAR ) 
  RETURNS void
AS $$
BEGIN
  INSERT INTO capitols (city,country)
                VALUES (cty,cntry);
END;
$$ LANGUAGE plpgsql;

SELECT * FROM addcity('paris','france');

Lines 1,4,5,8,9 are pretty much boiler plate, and you will see a similar elements on each PL/pgSQL function you ever read or write.  The others are specific to the purpose of the function.  Now line by line:
  1. The Create or replace function statement creates the function, over-writing any previous function with that signature.  The name, param list, and return definition combine to create the signature.  addcity is the name of our function, as it adds a city to our table.
  2. Line 2 is the parameter list, listing names and types of parameters passed to the function.  The type is required, the name is not.  There are automatic variables named $1, $2, etc for the parameters.  Note that we name the parameters *differently* than the fields.   Where the string 'cty' is used, it refers to the first parameter; there is no way to override that, so if the first parameter was named 'city', it would mask the field with the same name, and make the field unreferenceable.  So use unique names for parameters.
  3. The return value must be specified; since we do not have a meaningful return value, we indicate so with the 'RETURN void' statement. Return values may be indicated with OUT parameters, but that is outside the scope of this discussion.
  4. The '$$' symbol is known as a dollar quote, and opens a quoted string.  The dollar quote can be any non-whitespace characters between '$' signs, and must match at the beginning and end of the string.
  5. Each PG/pgSQL block is bounded by 'BEGIN' and 'END' statements.
  6. These two lines are just SQL with parameters interpolated into it by name.  See note 2 for why parameters are named differently than the fields.
  7. This ends the body of the function
  8. The '$$' closes the definition string, and the remainder of the line tells PostgreSQL that the function is in PL/pgSQL language.
  9. The SELECT statement provides a context for calling the function.  Just calling the function by itself is a syntax error, so we provide a no-results SELECT to call from.
The SQL in line 6 uses variable names where substitution parameters would be acceptable.  If you were to, for example, try to use a variable for the table name, it would fail with an error.    Say for some reason you had multiple tables all with city and country fields, and you wanted this function to work on a table selected at the time of the call.  You would pass the table name as a parameter, and you would have to use dynamic commands, created on the fly, like:

CREATE OR REPLACE FUNCTION  addcity
  ( tblnm VARCHAR, cty VARCHAR, cntry VARCHAR ) 
  RETURNS void
 AS $$
 BEGIN
   EXECUTE 'INSERT INTO ' || tblnm || ' (city,country) VALUES (\'' 
                          || cty || '\',\'' || cntry || '\' ) ';
 END;
 $$ LANGUAGE plpgsql;

 SELECT * FROM addcity('capitols','paris','france');

  1. The SQL is constructed of parts concatenated together with the '||' concatenation operator to create an executable query; otherwise this example is like the preceding.
This dynamic execution is clunky and fragile and insecure. For example, escaping of values is not done for the query dynamically created,  so SQL injection attacks might be a threat.  Consider carefully how you use such an approach.

Next week, we will be back with an introduction to the RdbAdmin program, and a walk-through on how to experiment with PL/pgSQL in that software.

.

No comments:

Post a Comment