PostgreSQL access authorization information is stored in two places: the pg_shadow system table and the pg_hba.conf system file. The pg_shadow table holds the information for specific database user accounts, along with password information and some system-level privilege information. The pg_hba.conf file controls which users can connect to which databases from which machines. Once a user has been authenticated, PostgreSQL relies on authorization information stored primarily in the relacl column of the pg_class table. In this section, we delve into the purpose and structure of each of these parts.
The pg_shadow Table
The pg_shadow table contains detailed information about PostgreSQL users. It stores various system-level privileges and password information for database users. Describing the pg_shadow table in the psql program shows the following:
template1=# \d pg_shadow
     Table "pg_catalog.pg_shadow"
   Column    |  Type   | Modifiers
-------------+---------+-----------
 usename     | name    | not null
 usesysid    | integer | not null
 usecreatedb | boolean | not null
 usesuper    | boolean | not null
 usecatupd   | boolean | not null
 passwd      | text    |
 valuntil    | abstime |
 useconfig   | text[]  |
The various fields of the pg_shadow table break down as follows:
• usename: Column that stores the username. Usernames must be unique cluster-wide.
• usesysid: Column that stores an internal system ID for each user. Normally, the system ID for the initial superuser, created when the database cluster is initialized, is listed as 1, and users created afterward begin at 100 and increase serially.
• usecreatedb: System-level privilege that allows the user to create new databases on the system. The default for new users is false.
• usesuper: System-level privilege that determines if a user is a superuser. Making a user a superuser is akin to giving someone root access on Unix or setting them up as a system administrator on Windows, and so this should be used only when absolutely necessary.
• usecatupd: System-level privilege that allows the user to directly update the system catalogs. Because it is very easy to corrupt your database when modifying these tables, granting this privilege to users is not recommended.
• passwd: Field that stores the user’s password, usually as an MD5 hash rather than in plain text.
• valuntil: Field that sets the time at which the user’s password expires. Note that although the account itself remains unchanged, the user’s password will no longer work after this time.
• useconfig: List that stores the user’s session defaults for run-time configuration variables.
■Note Since the pg_shadow table could contain password information, it is not intended for general use. Rather, in cases where users may want to find out system information, the pg_user view can be used instead. It looks exactly the same as pg_shadow, except the password information is always hidden behind a number of asterisks (*).
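To make the MD5-style passwd format more concrete, the following is a minimal Python sketch of how such a value can be derived. It assumes the scheme used by PostgreSQL releases of this era, in which the stored string is the literal prefix md5 followed by the MD5 hex digest of the password concatenated with the username. The function name and the sample credentials are hypothetical and are not part of PostgreSQL.

```python
import hashlib


def md5_shadow_password(password: str, username: str) -> str:
    """Build an MD5-style passwd value.

    Assumed scheme: 'md5' + hex(md5(password + username)).
    """
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(md5_shadow_password("secret", "rob"))
```

Because the username is mixed into the hash, two users who choose the same password end up with different stored values under this scheme.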
654 C H A P T E R 2 9 ■ S E C U R I N G P O S T G R E S Q L
The pg_hba.conf File
Client authentication is controlled by the pg_hba.conf file, which is typically found in the data directory of the PostgreSQL server. By default, the pg_hba.conf file is set to allow connections from the local machine only, but it gives you the flexibility to handle extremely complex connection requirements.
The basic format of pg_hba.conf is a list of single-line entries, with each entry containing a number of fields separated by tabs or spaces. Each line in the file represents a connection rule, based on several different specified parameters. The parts of a pg_hba.conf entry are as follows:
• TYPE: Describes the type of connection:
  • local: Made through a Unix-domain socket on the local machine.
  • host: Made via TCP/IP. You must also specify an address for PostgreSQL to listen on via the listen_addresses variable in the postgresql.conf file for TCP/IP connections to work.
  • hostssl and hostnossl: Variants of the host connection that are used in conjunction with SSL connectivity; these are discussed later in this chapter.
• DATABASE: Specifies which database or databases the user is allowed to connect to. Multiple databases can be specified with a comma-separated list of database names. You can also use one of several keywords for further options:
  • all: Signifies that the user can connect to all databases in the system.
  • sameuser: Means that the user can only connect to a database with the same name as the user connecting.
  • samegroup: Signifies that the user must belong to the group with the same name as the database they are attempting to connect to.
• USER: Specifies which user or users the specified connection rule applies to. Multiple users can be specified by using a comma-separated list of usernames. To use a group name, you should append a + to the name of the group. You can also use the keyword all to have the rule apply to all users.
• CIDR-ADDRESS: Specifies which client machines the given connection rule applies to. The format is that of a numeric IP address followed by a valid CIDR mask length (e.g., 192.168.21.12/32). Note that bits to the right of the CIDR mask must be zero, and there cannot be any white space between the IP address, the /, and the mask. For example, if you wanted anyone on your local subnet to be able to connect, you would write the entry as 172.21.1.0/24. This field applies only to TCP/IP-based connection types.
• IP-ADDRESS + IP-MASK: As an alternative to the CIDR-ADDRESS notation, you can use separate IP-ADDRESS and IP-MASK entries. Using this notation, our example would look like 172.21.1.0 for the IP-ADDRESS field and 255.255.255.0 for the mask. Like the CIDR-ADDRESS notation, these fields apply only to TCP/IP-based connection types.
• METHOD: Specifies the authentication method that applies to the specified connection rule. Several different authentication methods are available. Only the most common methods are listed here, but you can consult the online documentation for more information:
  • trust: Allows connections matching the specified rule without any type of authentication or verification of the user or their password. This method is not recommended for production machines.
  • password: Requires that a password be supplied for any connecting user. The password will be sent in plain text over the connection, so it is often recommended that this method be used only in conjunction with some type of SSL arrangement.
  • md5: Requires the connecting user to supply an MD5-hashed password for authentication. Note that even though the password is hashed, the hash itself still travels over an unencrypted connection, so it is not immune to sniffing-based attacks. While md5 is generally preferred over the password method, it too is best used in conjunction with some type of SSL connection.
  • krb5: Uses Kerberos 5 to authenticate the user. This requires an external Kerberos key file and is available only for TCP/IP-based connections.
  • pam: Authenticates the user via the Pluggable Authentication Modules service available from the operating system.
  • ident: Authenticates users based on the connecting client’s username, as determined by the operating system. You can create an optional ident map file if you want certain operating system users to be able to connect as different database users. Note that ident is not generally recommended as an authorization protocol, and therefore should be used only on machines on which the client can be well-secured.
  • reject: Automatically rejects any connection matching the specified rule. This can sometimes be useful for filtering out certain connections from a larger group.
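The address fields described above can be explored with a short Python sketch that uses the standard ipaddress module. It is only an illustration of the notation, not how PostgreSQL itself parses pg_hba.conf; the helper names client_matches and is_valid_cidr are hypothetical.

```python
import ipaddress

# The CIDR-ADDRESS form and the IP-ADDRESS + IP-MASK form of the
# example from the text describe the same network.
cidr = ipaddress.ip_network("172.21.1.0/24")
masked = ipaddress.ip_network("172.21.1.0/255.255.255.0")


def client_matches(client_ip: str, network) -> bool:
    """True if client_ip falls inside the network of a rule."""
    return ipaddress.ip_address(client_ip) in network


def is_valid_cidr(text: str) -> bool:
    """False if any bits to the right of the mask are nonzero."""
    try:
        ipaddress.ip_network(text, strict=True)
        return True
    except ValueError:
        return False
```

Under these assumptions, cidr == masked holds, client_matches("172.21.1.57", cidr) is true, and is_valid_cidr("172.21.1.5/24") is false because the host bits are not zero.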
The order in which each row is placed in the pg_hba.conf is significant because PostgreSQL will authenticate incoming connections based on the first available match it finds within the file. For this reason, you will usually find that earlier entries will have strict connection-matching parameters along with weaker authentication methods, followed by more wide-reaching connection-matching parameters alongside tougher authentication methods. A typical pg_hba.conf might look something like this:
# Allow users on the local system to connect to any database under
# any username using Unix domain sockets
# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
local   all         all                               trust
# Implement the same permissions as above, but for connections on
# local loopback TCP/IP connections. (i.e. localhost)
# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
host    all         all         127.0.0.1/32          trust
# Allow any client with IP address 192.168.76.x to connect to the
# "warehouse" database as user "reports" as long as a password is
# given
# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
host    warehouse   reports     192.168.76.0/24       password
# Allow user "rob" from host 192.168.21.12 to connect to any
# database if the user's password is correctly supplied.
#
# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
host    all         rob         192.168.21.12/32      md5
# Allow connections from any IP address on the Internet to either
# the bpsimple or bpfinal database, provided that the user can
# pass an ident check for being either rick or neil
# TYPE DATABASE USER CIDR-ADDRESS METHOD
host bpsimple,bpfinal rick,neil 0.0.0.0/0 ident
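The first-match behavior described above can be sketched in a few lines of Python. This is a simplified model for illustration only: the HbaRule type and the pick_method function are hypothetical, the rule table mirrors the sample entries, and keywords such as sameuser and samegroup, the + group syntax, and the hostssl and hostnossl types are deliberately left out.

```python
import ipaddress
from typing import NamedTuple, Optional


class HbaRule(NamedTuple):
    conn_type: str        # "local" or "host"
    databases: str        # comma-separated names, or "all"
    users: str            # comma-separated names, or "all"
    cidr: Optional[str]   # None for "local" rules
    method: str


# Sample entries in file order (order matters).
RULES = [
    HbaRule("local", "all", "all", None, "trust"),
    HbaRule("host", "all", "all", "127.0.0.1/32", "trust"),
    HbaRule("host", "warehouse", "reports", "192.168.76.0/24", "password"),
    HbaRule("host", "all", "rob", "192.168.21.12/32", "md5"),
    HbaRule("host", "bpsimple,bpfinal", "rick,neil", "0.0.0.0/0", "ident"),
]


def _listed(name: str, field: str) -> bool:
    """True if field is 'all' or its comma-separated list contains name."""
    return field == "all" or name in field.split(",")


def pick_method(conn_type, database, user, client_ip=None, rules=RULES):
    """Return the method of the first matching rule, or None if none match.

    client_ip must be given for "host" connections.
    """
    for rule in rules:
        if rule.conn_type != conn_type:
            continue
        if not (_listed(database, rule.databases) and _listed(user, rule.users)):
            continue
        if rule.cidr is not None:
            network = ipaddress.ip_network(rule.cidr)
            if ipaddress.ip_address(client_ip) not in network:
                continue
        return rule.method  # first match wins; later rules are ignored
    return None
```

In this model, user reports connecting to warehouse from 192.168.76.10 is asked for a password, but the same user connecting from 127.0.0.1 is trusted, because the loopback rule appears earlier in the list. A connection that matches no rule yields None, which corresponds to the connection being refused.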
The pg_class Table
Once a user has authenticated through the pg_hba.conf file, the next step is to determine whether the user is authorized to execute a given query. This is decided primarily by information found in the pg_class table. The pg_class table contains a wide array of information about most of the different “table-like” objects in a PostgreSQL database, including tables, views, and indexes, but for the purposes of securing your database, the key column in this table is called relacl, which can be thought of as the “relations access control list.” The relacl column is rather cryptic at first glance, but its information can be deduced with a little direction. The relacl column’s data type is an array of aclitems, which is quite different from any other column you might have seen.
A typical relacl entry might look something like this:
phppg=# SELECT relname, relacl FROM pg_class WHERE relname='pg_class';
 relname  |    relacl
----------+---------------
 pg_class | {=r/postgres}
(1 row)
This means that the user postgres has granted read permissions on the table pg_class to PUBLIC. But we are getting a little ahead of ourselves, so let’s take a moment to break down the different types of permissions that are available to users and what their corresponding entries would be.
The list of attributes you will find in the relacl column includes the following items:
• a: Stands for “append” and represents INSERT privileges.
• r: Stands for “read” and represents SELECT privileges.
• w: Stands for “write” and represents UPDATE privileges.
• d: Stands for “delete” and represents DELETE privileges.
• R: Stands for “rule” and allows the user to create or drop rules on the given relation.
• x: For the REFERENCES privilege. Users with this privilege can create foreign keys from other tables that reference the relations in question.
• t: For the TRIGGER privilege. Users with this privilege can create and drop triggers on the given relation.
An entry within the relacl column consists of user information followed by an equal sign and one or more of the preceding attributes, which together form a complete privilege entry. If the user portion is left blank, the privileges listed are granted to PUBLIC, that is, to all users. In later versions of PostgreSQL, these entries are followed by a /username portion that signifies who granted the permissions in the entry. Let’s take a look at a few examples:
The first example demonstrates SELECT, INSERT, and UPDATE privileges for user rob, granted by user dylan:
rob=arw/dylan
The next example shows SELECT privileges for PUBLIC, granted by the postgres superuser:
=r/postgres
Finally, this example demonstrates full privileges for user dylan, granted by user dylan, and INSERT and UPDATE privileges for PUBLIC, granted by user dylan:
{dylan=arwdRxt/dylan,=aw/dylan}
■Note The owner of an object gets full privileges by default. However, these privileges are not displayed in the relacl column by default. Instead, they become visible only when they have been explicitly granted by someone.
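To make the relacl notation easier to read, the following Python sketch decodes the text form of an entry into its grantee, privileges, and grantor. It is a minimal illustration built only on the attribute letters listed above; the function names are hypothetical, and it assumes simple, unquoted user names and no attribute letters beyond those in the list.

```python
# Attribute letters from the list above, mapped to SQL privilege names.
PRIVILEGES = {
    "a": "INSERT",
    "r": "SELECT",
    "w": "UPDATE",
    "d": "DELETE",
    "R": "RULE",
    "x": "REFERENCES",
    "t": "TRIGGER",
}


def parse_aclitem(item: str):
    """Split 'grantee=privs/grantor' into (grantee, [privileges], grantor).

    A blank grantee means PUBLIC; a missing '/grantor' part yields None.
    Unknown attribute letters raise KeyError.
    """
    grantee, _, rest = item.partition("=")
    codes, _, grantor = rest.partition("/")
    privileges = [PRIVILEGES[code] for code in codes]
    return (grantee or "PUBLIC", privileges, grantor or None)


def parse_relacl(relacl: str):
    """Parse the text form of a relacl array such as '{a=r/b,=aw/b}'."""
    body = relacl.strip().strip("{}")
    return [parse_aclitem(item) for item in body.split(",") if item]


if __name__ == "__main__":
    for entry in parse_relacl("{dylan=arwdRxt/dylan,=aw/dylan}"):
        print(entry)
```

For example, parse_aclitem("=r/postgres") returns ("PUBLIC", ["SELECT"], "postgres"), which matches the reading of that entry given earlier.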