CHAPTER 10 ■ FUNCTIONS, STORED PROCEDURES, AND TRIGGERS

Summary

In this chapter, we looked at ways in which we can extend the functionality of PostgreSQL queries. We have seen that PostgreSQL provides many operators and functions that we can use to refine queries and extract information.

The procedural languages supported by PostgreSQL allow us to develop quite sophisticated server-side processing by writing procedures in PL/pgSQL, SQL, and other languages. This provides the opportunity for the database server to implement complex application functionality independently of the client.

Stored procedures are stored in the database itself and may be called by the application or, in the form of triggers, called automatically when changes are made to database tables. This gives us another means of enforcing referential integrity. For simple referential integrity, it’s generally best to stick to constraints, as they are more straightforward, efficient, and less error-prone. The power of triggers and stored procedures comes when your declarative constraints become very complex, or you wish to implement a constraint that is too complex for the declarative form.

Now that we have covered some advanced PostgreSQL techniques, in the next chapter, we will move on to the topic of how to care for a PostgreSQL database.

MatthewStones_4789C10.fm Page 307 Wednesday, February 23, 2005 6:47 AM

CHAPTER 11

PostgreSQL Administration

In this chapter, we will look at how to care for a PostgreSQL database. This covers items ranging from configuring access to the system through managing the placement of database files, maintaining performance, and, crucially, backing up your system.
As we progress through this chapter, we will cover the following topics:

• System-level configuration of a PostgreSQL installation
• Database initialization
• Server startup and shutdown
• User and group management
• Tablespace management
• Database and schema management
• Backup and recovery
• Ongoing maintenance of a PostgreSQL server

While learning and experimenting with these administrative tasks, you will want to use a test PostgreSQL system that doesn’t contain any information you particularly care about. Making experimental system-wide changes or testing backup and restore procedures on a PostgreSQL database that contains live data is not a good idea.

System Configuration

We saw in Chapter 3 how to install PostgreSQL, but we didn’t really look in any depth at the resulting directory structure and files. Now we will explore the PostgreSQL file system and main system configuration options.

The PostgreSQL file system layout is essentially the same on Windows and Linux platforms. On a Linux system, the base directory of the installation will vary slightly, depending on which installation method you used: installing from prepackaged executables, such as binary RPMs, or compiling it yourself from source code. There may also be fewer or more directories, depending on which options you installed.

On a Windows system, by default, your installation base directory will be something like C:\Program Files\PostgreSQL\8.0.0, under which you will find several subdirectories. On Linux, the base directory for a source code installation will generally be /usr/local/pgsql. For a prebuilt binary installation, the location will vary. A common location is /var/lib/pgsql, but you may find that some of the binary files have been put in directories already in the search path, such as /usr/bin, to make accessing them more convenient.
Under the PostgreSQL base installation directory, you will normally find around seven subdirectories, depending on your options and operating system:

• bin
• data
• doc
• include
• lib
• man
• share

On Windows, the man subdirectory will be omitted, but probably at least one additional subdirectory, pgAdmin III, will be present. You will find additional directories, such as jdbc and odbc, if you installed some of the optional components.

In this section, we will take a brief tour of the seven subdirectories, and along the way look at the more important configuration files and the significant options in them that we might wish to change.

The bin Directory

The bin directory contains a large number of executable files. Table 11-1 lists the principal files in this directory.

Table 11-1. Principal Files in the bin Directory

Program       Description
postgres      Database back-end server
postmaster    Database listener process (the same executable as postgres)
psql          Command-line tool for PostgreSQL
initdb        Utility to initialize the database system
pg_ctl        PostgreSQL control: start, stop, and restart the server
createuser    Utility to create a database user
dropuser      Utility to delete a database user
createdb      Utility to create a database
dropdb        Utility to delete a database
pg_dump       Utility to back up a database
pg_dumpall    Utility to back up all databases in an installation
pg_restore    Utility to restore a database from backup data
vacuumdb      Utility to help optimize the database
ipcclean      Utility to delete shared memory segments after a crash (Linux only)
pg_config     Utility to report PostgreSQL configuration
createlang    Utility to add support for language extensions (see Chapter 10)
droplang      Utility to delete language support
ecpg          Embedded SQL compiler (optional, see Chapter 14)

The data Directory

The data directory contains subdirectories with data files for the base installation, and also the log files that PostgreSQL uses internally. Normally, you never need to know about the subdirectories of the data directory. Also in this directory are several configuration files, which contain important configuration settings you may wish, or need, to change. Table 11-2 lists the user-accessible files in the data subdirectory.

Table 11-2. User-Accessible Files in the data Subdirectory

File              Description
pg_hba.conf       Configures client authentication options
pg_ident.conf     Configures operating system to PostgreSQL authentication name mapping when using ident-based authentication
PG_VERSION        Contains the version number of the installation, for example 8.0
postgresql.conf   Main configuration file for the PostgreSQL installation
postmaster.opts   Gives the default command-line options to the postmaster program
postmaster.pid    Contains the process ID of the postmaster process and an identification of the main data directory (this file is generally present only when the database is running)

The pg_hba.conf File

The hba (host based authentication) file tells the PostgreSQL server how to authenticate users, based on a combination of their location, type of authentication, and the database they wish to access.

A common requirement is to add configuration lines to allow access to some, or all, databases from remote machines. At the time of writing, the default configuration is quite secure, preventing access to any database from any remote machine. (See the “Client Authentication” section in the PostgreSQL documentation for full details.)
Each line in the pg_hba.conf file corresponds to a single allow or deny rule. Rules are processed in the order in which they appear in the file, and the first rule that matches a connection is the one applied, so a narrow deny rule must precede any broader allow rule that would otherwise match. In PostgreSQL release 8.0, each line has the following five items:

• TYPE: This column is usually local, for connections made through a Unix-domain socket on the server machine, or host, for connections made over TCP/IP (including TCP/IP connections from the server machine itself).
• DATABASE: This column provides a comma-separated list of the databases for which this rule applies, or the special name all, if the rule applies for all databases.
• USER: This column provides a comma-separated list of users for which the rule applies, all for all users, or +groupname for users belonging to a specific group. (Groups are covered in the “Group Configuration” section later in this chapter.)
• CIDR-ADDRESS: CIDR stands for Classless Inter-Domain Routing. This column gives the range of client addresses for which the rule applies, written as an IP address followed by a slash and the number of leading bits that must match. For example, the entry 192.0.0.0/8 means the rule applies for all hosts whose address starts with 192, while 192.168.0.0/24 covers only 192.168.0.*. This column is used on host lines; local lines have no address.
• METHOD: This column specifies how users matching the previous conditions are to be authenticated. There is a wide range of choices. Table 11-3 lists the common options.

A standard default configuration line would be something similar to this:

# TYPE  DATABASE  USER  CIDR-ADDRESS  METHOD
host    all       all   127.0.0.1/32  md5

Table 11-3. Common Authentication Methods

Method     Description
trust      The user is allowed, with no need to enter any further passwords. Generally, you will not want to use this option except on experimental PostgreSQL systems, although it is a reasonable choice where security isn’t an issue.
reject     The user is rejected. This can be useful for preventing access from a range of machines, because the rules in the file are processed in order. For example, you could reject all users from 192.168.0.4, but later in the file, accept connections from other machines in the 192.168.0.0/24 subnet.
md5        The user must supply a password, which the client sends as an MD5 hash rather than as plain text. This is a good choice for many situations.
crypt      This method is similar to the md5 method for pre-7.2 installations. All new installations should use md5 in preference.
password   The user must provide a plain-text password. This is not very secure, but useful when you are trying to identify login problems.
ident      The user is authenticated using the client name from the user’s host operating system. This works with the pg_ident.conf file.

The default line shown above allows any user connecting over TCP/IP from the server machine itself to access all databases, but the client must provide the password in an MD5-hashed form. Normally, this is transparent to the user, as the client program will determine that the password the user enters needs to be hashed before being sent to the PostgreSQL server. An alternative would be to replace md5 with trust, which would say that any user who had been able to log in to the local machine was also able to log in to the database, without requiring further authentication.

■Note If you use MD5 authentication, you must ensure that your PostgreSQL users have passwords, or the MD5-authenticated login will fail.

Generally, this minimal configuration is fine for local users, but it doesn’t allow any access for users across the network. To do that, we need to add lines to the pg_hba.conf file.

Suppose we wanted to allow all users on the subnetwork 192.168.0.* access to all databases, provided they supply the correct password in MD5-hashed form. This is probably the most common type of addition needed to the standard configuration file. We would add the following extra line to the pg_hba.conf file:

host all all 192.168.0.0/24 md5

Now suppose some additional administrators require access from outside this subnet, but we don’t want to permit ordinary users access.
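Before adding further rules, it can help to check the two mechanics every pg_hba.conf rule relies on: which addresses a CIDR range covers, and which rule wins when more than one could match. The following is a minimal Python sketch, not part of PostgreSQL. The RULES list, the first_match function, and the sample client addresses are all hypothetical, and the matching is deliberately simplified to the address column only (the real server also compares the connection type, database, and user).

```python
import ipaddress

# Hypothetical, simplified rule list in pg_hba.conf column order:
# (type, database, user, CIDR address, method).
RULES = [
    ("host", "all", "all", "192.168.0.4/32", "reject"),
    ("host", "all", "all", "192.168.0.0/24", "md5"),
]


def first_match(client_ip, rules=RULES):
    """Return the method of the first rule whose address range contains
    client_ip, or None if no rule matches (assumption: only the address
    column is compared in this sketch)."""
    addr = ipaddress.ip_address(client_ip)
    for _type, _database, _user, cidr, method in rules:
        if addr in ipaddress.ip_network(cidr):
            return method
    return None


if __name__ == "__main__":
    # Rule order decides the outcome: the narrow reject comes first.
    for ip in ("192.168.0.4", "192.168.0.7", "192.168.1.7"):
        print(ip, "->", first_match(ip))

    # The mask width decides how many addresses a rule covers.
    for cidr in ("192.168.0.0/24", "192.168.0.0/16", "192.0.0.0/8"):
        print(cidr, "covers", ipaddress.ip_network(cidr).num_addresses, "addresses")

    # An address with host bits set beyond the mask is normalized when
    # strict checking is turned off.
    print(ipaddress.ip_network("192.168.0.0/8", strict=False))
```

Under these assumptions, the narrow reject rule blocks 192.168.0.4 while the broader /24 rule still admits its neighbors; if the two rules were swapped, 192.168.0.4 would be admitted, because the md5 rule would match first. The address counts show why the mask width matters: /24 spans only 192.168.0.*, whereas /16 spans all of 192.168.*.*.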
We would add a line to allow members of the PostgreSQL admins group access from anywhere on the 192 network, like this:

host all +admins 192.0.0.0/8 md5

Note that additional configuration is required before the server will accept remote connections at all: it must be told to listen on a network interface, either through the listen_addresses option in postgresql.conf or through the -i option to postmaster. Both are described a bit later in this chapter.

The pg_ident.conf File

The pg_ident.conf file is used in conjunction with the ident option of pg_hba.conf. It works by determining the username on the machine the client logged in to, and mapping that name to a PostgreSQL username. It relies on the Identification Protocol, defined in RFC 1413. We would not generally consider this a very secure method of access control.

The postgresql.conf File

postgresql.conf is the main configuration file that determines how PostgreSQL operates. The file consists of a large number of lines, each of the form:

option_name = value

This sets the required behavior for each option. Where the option is a string, the value should be enclosed in single quotes. Numbers do not need to be quoted. Boolean options should be set to either true or false.

Table 11-4 lists the main options in the postgresql.conf file.

Table 11-4. Principal postgresql.conf Options

Option                          Value and Meaning
listen_addresses                Sets the address on which PostgreSQL accepts connections. This will normally be localhost, but for machines with multiple IP addresses, you may wish to specify a specific IP address.
port                            Sets the port on which PostgreSQL is listening. By default, this is 5432.
max_connections                 Sets the number of concurrent connections allowed. On most operating systems, this will be 100. Increasing this number will increase the system resource overhead; in particular, the amount of shared memory in use will be increased.
superuser_reserved_connections  Sets the number of connections from the maximum which are reserved for superusers. By default, this is 2. You may wish to increase it to ensure superusers are never prevented from connecting to the database because too many ordinary users are connected.
authentication_timeout          Defines how long a client has to complete authentication before it is automatically disconnected. By default, this is 60 seconds. You may wish to decrease it if you see many unauthorized people attempting to connect to the database.
shared_buffers                  Sets the number of buffers being used by PostgreSQL. A typical value would be 1000. Decreasing this value saves system resources on a lightly loaded system. Increasing it may improve performance on a heavily used production system.
work_mem                        Tells PostgreSQL how much memory it can use before creating temporary files for processing intermediate results. The default is 1MB. If you have very large tables and plenty of memory, increasing this value may improve performance.
log_destination                 Determines where PostgreSQL logs server messages by providing a comma-separated list of destinations: stderr, syslog, or (on Windows) eventlog.
log_min_messages                Sets the level of message that is logged. The options, from most logging down to least logging, are debug5, debug4, debug3, debug2, debug1, info, notice, warning, error, log, fatal, and panic. By default, notice will be used.
log_error_verbosity             Sets the amount of detail written to the logs. The default is default. Setting this option to terse reduces the amount written. Setting it to verbose writes more information.
log_connections                 Logs connections to the database. This is false by default, but if you are running a secure database, you almost certainly need to change this to true.
log_disconnections              Logs disconnections from the database.
search_path                     Controls the order in which schemas are searched. The default is $user,public. (See the “Schema Management” section later in this chapter.)
default_transaction_isolation   Sets the default transaction isolation level, which was discussed in Chapter 9. The default is read committed, which is generally a good choice.
deadlock_timeout                Sets the length of time before the system checks for deadlocks when waiting for a lock on a database table. By default, this is set to 1000 milliseconds. You may want to increase it on a heavily loaded production system.
statement_timeout               Sets a maximum time, in milliseconds, that any statement is allowed to execute. By default, this is set to 0, which disables this feature.
stats_start_collector           If set to true, PostgreSQL collects internal statistics, usable by the pg_stat_activity and other statistics views.
stats_command_string            If set to true, enables the collection of statistics on commands that are currently being executed.
datestyle                       Sets the default date style, which was discussed in Chapter 4. The default is iso, mdy.
timezone                        Sets the default time zone. By default, this is set to unknown, which means PostgreSQL should use the system time zone.
default_with_oids               Controls whether the CREATE TABLE command defaults to creating tables with OIDs. By default, this is set to true at the time of writing. This option may be required in the future should PostgreSQL default to not creating OIDs but you have an older application which relies on them being present. However, we strongly suggest that you do not assume OIDs are present.

The postmaster.opts File

The postmaster.opts file sets the default invocation options for the postmaster program, which is the main PostgreSQL program. Typically, it will contain the full path to the postmaster program, a -D option to set the full path to the principal data directory, and optionally, a -i flag to enable network connections. The postmaster options are listed in Table 11-5.

Here is an example of a postmaster.opts file from Linux, allowing network connections:

/usr/local/pgsql/bin/postmaster '-i' '-D' '/usr/local/pgsql/data'

And here is a typical Windows file (which would all be on a single line), disallowing remote connections:

C:/Program Files/PostgreSQL/8.0.0/bin/postmaster.exe "-D" "C:/Program Files/PostgreSQL/8.0.0/data"

Notice the different quoting required on Windows systems.

Table 11-5. postmaster Options

Option     Description
-B nbufs   Sets the number of shared memory buffers to nbufs.
-d level   Sets the level of debug information (level should be a number 1 through 5) written to the server log.
-D dir     Sets the database directory (/data) to dir. There is no default value. If no -D option is set, the value of the environment variable PGDATA is used.
-i         Allows remote TCP/IP connections to the database.
-l         Allows secure database connections using the Secure Sockets Layer (SSL) protocol. This requires the -i option (network access) and support for SSL to have been compiled in to the server.
-N cons    Sets the maximum number of simultaneous connections the server will accept.
-p port    Sets the TCP port number that the server should listen on.
--help     Gets a helpful list of options.

Other PostgreSQL Subdirectories

The following are the other subdirectories normally found under the PostgreSQL base installation directory:

• The doc directory: This contains the online documentation, and may contain additional documentation for user-contributed additions, depending on your installation choices.
• The include and lib directories: These contain the header and library files needed to create and run client applications for PostgreSQL. See Chapters 13 and 14 for details of libpq and ecpg, which use these directories.
• The man directory: On Linux (and UNIX) only, this contains the manual pages. Adding it to your MANPATH (for example, $ export MANPATH=$MANPATH:/usr/local/pgsql/man) will allow you to view the PostgreSQL manual pages using the man command.
• The share directory: This contains a mix of configuration sample files, user-contributed material, and time zone files. There is also a list of standard SQL features supported by the current version of PostgreSQL.

[...]

... when we installed PostgreSQL, usually postgres, using the chown command:

# ls -ld /opt/pgdata
drwxr-xr-x 2 root root 4096 Nov 21 14:07 /opt/pgdata
# chown postgres.postgres /opt/pgdata
# ls -ld /opt/pgdata
drwxr-xr-x 2 postgres postgres 4096 Nov 21 14:07 /opt/pgdata
#

Now we are ready to create a PostgreSQL tablespace associated with our new directory. We must do this from within the psql program. Directories...

... then we use psql to check the tablespace. Finally, we create the new database:

# psql -U postgres template1
Welcome to psql 8.0.0, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
      \h for help with SQL commands
      \? for help with psql commands
      \g or terminate with semicolon...

... shown in Figure 11-4.

Figure 11-4. Object layout inside the PostgreSQL database server

Creating Databases

PostgreSQL databases are created within psql with the CREATE DATABASE command, which has the following syntax:

CREATE DATABASE dbname [ [ WITH ] [ OWNER [=]owner ] [ TEMPLATE [=] template ] [...
... 11-11 PostgreSQL’s Handling of Hazardous Events

Event                                      PostgreSQL Action
Client crash                               PostgreSQL will roll back any transactions (see Chapter 9) in progress for that client
Client network failure                     PostgreSQL will roll back any transactions in progress for that client
Server crash                               PostgreSQL will roll back incomplete transactions when the server restarts
Operating system crash with no data loss   PostgreSQL...

... system files from the command line:

# cd /opt/pgdata
# ls -l
total 8
drwx------ 2 postgres postgres 4096 Nov 27 13:35 17864
-rw------- 1 postgres postgres    4 Nov 21 14:19 PG_VERSION
#

The rather strange number, 17864, is simply a name that PostgreSQL has chosen to use as a directory to store the files. The PG_VERSION file is used by PostgreSQL internally to track which version of software was used to create the database...

... provide a standard script for you. Do ensure that the PostgreSQL server gets the opportunity for a clean shutdown whenever the operating system shuts down.

PostgreSQL Internal Configuration

We have now seen how to configure our PostgreSQL server so that it accepts remote connections as required. It’s now time to look at the configuration elements of PostgreSQL that are set internally to the server. We will...
... following creates a user, neil, who can create other users and databases, but whose account will expire on December 31, 2006:

CREATE USER neil PASSWORD 'secret' CREATEDB CREATEUSER VALID UNTIL '2006-12-31';

Using the createuser Utility

PostgreSQL also has a utility, createuser, which we saw briefly in Chapter 3, to help with the creation of PostgreSQL users if you wish to do this from the operating system...

Database Initialization

When PostgreSQL is first installed, we must arrange for a database to be created. We did this back in Chapter 3 by using initdb.

■Note Almost all PostgreSQL installations, with the exception of those built from source, arrange for initdb to be called automatically...

... because PostgreSQL’s default behavior is to create a schema called public and place all the tables in that schema. By default, PostgreSQL assumes that it should look for any table your SQL accesses in the public schema. This means that users who have no need of schemas can pretty much ignore them.

... table of the same name owned by neil. This is easy to see in pgAdmin III, as shown in Figure 11-6. Notice that both the schemas schema1 and neil have a table called table11.

Figure 11-6. Two tables with the same name, in the same database

This ability to subdivide schemas in a database, both ...