In PostgreSQL, dynamic SQL lets database programmers build statements at run time and cut down on repetitive work. With it, developers can handle tasks such as daily table partitioning, adding missing indexes on foreign keys, or adding data auditing features without writing much code. Dynamic SQL also avoids the side effects of PL/pgSQL plan caching, because statements run through EXECUTE are not cached.
Dynamic SQL is built on the EXECUTE statement, which takes a string and runs it as an SQL command. The syntax is:
EXECUTE command-string [ INTO [STRICT] target ] [ USING expression [, ...] ];
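As a minimal sketch of how the INTO and USING clauses fit together, the anonymous block below runs a command string, binds one value to the $1 placeholder, and stores the single result in a variable. The variable name total and the use of generate_series are illustrative choices, not part of the car portal schema.

```sql
DO $$
DECLARE
  total bigint;  -- hypothetical variable that receives the result
BEGIN
  -- $1 in the command string is bound to the value listed after USING
  EXECUTE 'SELECT count(*) FROM generate_series(1, $1)'
    INTO total
    USING 10;
  RAISE NOTICE 'total = %', total;
END;
$$;
```

Adding STRICT after INTO makes the command raise an error unless it returns exactly one row.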
Executing DDL statements in dynamic SQL
At times, operations need to be performed at the database object level, encompassing tables, indexes, columns, and roles. For instance, after deploying a new schema, a database developer may wish to vacuum and analyze specific schema objects to refresh the statistics. To analyze the tables within the car_portal_app schema, the following script can be employed:
DO $$
DECLARE
  table_name text;
BEGIN
  FOR table_name IN
    SELECT tablename FROM pg_tables WHERE schemaname = 'car_portal_app'
  LOOP
    RAISE NOTICE 'Analyzing %', table_name;
    -- %I quotes the identifier, so mixed-case or unusual table names are handled
    EXECUTE format('ANALYZE car_portal_app.%I', table_name);
  END LOOP;
END;
$$;
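Table names that contain uppercase letters or spaces must be quoted when they are spliced into a command string. The following sketch shows how format with the %I specifier builds such commands. The table name Daily Sales is hypothetical and is used only to show the quoting.

```sql
-- A plain lowercase name is left as it is
SELECT format('ANALYZE %I.%I', 'car_portal_app', 'advertisement');

-- A name with uppercase letters and a space is wrapped in double quotes:
-- ANALYZE car_portal_app."Daily Sales"
SELECT format('ANALYZE %I.%I', 'car_portal_app', 'Daily Sales');
```

Concatenating the second name without quoting would produce a command that fails to parse.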
Executing DML statements in dynamic SQL
Dynamic SQL is particularly advantageous in applications that require interactive data manipulation. For example, consider a billing application that generates monthly data, where users may filter results based on various criteria. In the context of the car portal application, a dynamic predicate can facilitate account searches as illustrated below:
CREATE OR REPLACE FUNCTION car_portal_app.get_account (predicate TEXT)
RETURNS SETOF car_portal_app.account AS
$$
BEGIN
  RETURN QUERY EXECUTE 'SELECT * FROM car_portal_app.account WHERE ' || predicate;
END;
$$ LANGUAGE plpgsql;
To validate the functionality of the above function, one might execute the following:
car_portal=> SELECT * FROM car_portal_app.get_account ('true') limit 1;
 account_id | first_name | last_name |      email      |             password
------------+------------+-----------+-----------------+----------------------------------
          1 | James      | Butt      | jbutt@gmail.com | 1b9ef408e82e38346e6ebebf2dcc5ece
(1 row)
car_portal=> SELECT * FROM car_portal_app.get_account (E'first_name=\'James\'');
Dynamic SQL and the caching effect
As previously noted, PL/pgSQL employs execution plan caching, which is beneficial when the generated plan is expected to remain static. For instance, a query such as:
SELECT * FROM account WHERE account_id = <INT>
is likely to utilize an index scan due to its selectivity, making caching advantageous for performance. Conversely, in scenarios where the execution plan may vary, caching can lead to inefficiencies. For example, consider the following query that counts advertisements since a specified date:
SELECT count (*) FROM car_portal_app.advertisement WHERE advertisement_date >= <certain_date>;
Here, the choice between an index scan and a sequential scan depends on the selectivity of the provided certain_date value. A cached plan that suits one date can perform poorly for another. A better approach is to rewrite the function with the EXECUTE command, so that the query is planned on every call:
CREATE OR REPLACE FUNCTION car_portal_app.get_advertisement_count (some_date timestamptz)
RETURNS BIGINT AS $$
DECLARE
  count BIGINT;
BEGIN
  -- $1 is bound to some_date by USING; the statement is planned on each call
  EXECUTE 'SELECT count (*) FROM car_portal_app.advertisement WHERE advertisement_date >= $1'
    INTO count
    USING some_date;
  RETURN count;
END;
$$ LANGUAGE plpgsql;
Recommended practices for dynamic SQL usage
While dynamic SQL offers flexibility, it also poses security risks, most notably SQL injection. An attacker can use SQL injection to execute unauthorized SQL statements, potentially exposing sensitive data or damaging the database. A common example of a vulnerable PL/pgSQL function is as follows:
CREATE OR REPLACE FUNCTION car_portal_app.can_login (email text, pass text)
RETURNS BOOLEAN AS $$
DECLARE
  stmt TEXT;
  result bool;
BEGIN
  -- Vulnerable on purpose: $1 and $2 are concatenated into the statement text
  stmt = E'SELECT COALESCE (count(*)=1, false) FROM car_portal_app.account
    WHERE email = \'' || $1 || E'\' and password = \'' || $2 || E'\'';
  RAISE NOTICE '%', stmt;
  EXECUTE stmt INTO result;
  RETURN result;
END;
$$ LANGUAGE plpgsql;
This function checks if the provided email and password match an account. However, it is susceptible to SQL injection, as demonstrated below:
car_portal=> SELECT car_portal_app.can_login(E'jbutt@gmail.com\'--', 'Do not know password');
In this case, the function erroneously returns true: the injected quote closes the email literal and the -- sequence turns the password check into a comment. To safeguard against SQL injection, consider the following best practices:
- Utilize the USING clause for parameterized dynamic SQL statements.
- Employ the format function with proper interpolation to construct queries, ensuring that %I escapes identifiers and %L escapes literals.
- Utilize quote_ident(), quote_literal(), and quote_nullable() to format identifiers and literals appropriately.
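The first two practices can be sketched as follows. To keep the example runnable on its own, it uses a temporary table demo_account as a stand-in for car_portal_app.account and a temporary function pg_temp.can_login_safe; both names, and the parameter names p_email and p_pass, are hypothetical.

```sql
-- Hypothetical stand-in for car_portal_app.account
CREATE TEMP TABLE demo_account (email text, password text);
INSERT INTO demo_account
VALUES ('jbutt@gmail.com', '1b9ef408e82e38346e6ebebf2dcc5ece');

CREATE OR REPLACE FUNCTION pg_temp.can_login_safe (p_email text, p_pass text)
RETURNS boolean AS $$
DECLARE
  result boolean;
BEGIN
  -- USING binds $1 and $2 as values; they are never parsed as SQL text
  EXECUTE 'SELECT count(*) = 1 FROM demo_account
           WHERE email = $1 AND password = $2'
    INTO result
    USING p_email, p_pass;
  RETURN result;
END;
$$ LANGUAGE plpgsql;

-- The injection attempt is now compared as an ordinary string: returns false
SELECT pg_temp.can_login_safe(E'jbutt@gmail.com\'--', 'Do not know password');

-- When a value has to appear in the statement text, %L escapes it as a literal
SELECT format('SELECT * FROM %I.%I WHERE email = %L',
              'car_portal_app', 'account', E'jbutt@gmail.com\'--');
```

USING works only for data values. Identifiers such as table or column names cannot be passed as parameters, which is where %I and quote_ident() apply.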
By adhering to these practices, developers can harness the power of dynamic SQL while mitigating potential security risks.