PostgreSQL
|
PostgreSQL_Logo.png
PostgreSQL is a free object-relational database server (database management system), released under the flexible BSD-style license. It offers an alternative to other open-source database systems (such as MySQL and Firebird), as well as to proprietary systems such as Oracle, Sybase, IBM's DB2 and Microsoft SQL Server.
PostgreSQL's unusual-looking name gives some readers pause in trying to pronounce it, especially those who pronounce SQL as "sequel". PostgreSQL's developers pronounce it "post-gress-Q-L". (Audio sample (http://postgresql.rmplc.co.uk/postgresql.mp3), 5.6k MP3). It is also common to hear it abbreviated as simply "postgres."
Contents |
History
PostgreSQL has had a lengthy evolution, starting with the Ingres project at UC Berkeley. The project lead, Michael Stonebraker had left Berkeley to commercialize Ingres in 1982, but eventually returned to academia. After returning to Berkeley in 1985, Stonebraker started a post-Ingres project to address the problems with contemporary database systems that had become increasingly clear during the early 1980s. The code bases of Postgres and Ingres started (and remain) completely separated.
The resulting project, named Postgres, aimed at introducing the minimum number of features needed to add complete support for types. These features included the ability to define types, but also the ability to fully describe relationships – up until this time widely used but maintained entirely by the user. In Postgres the database "understood" relationships, and could retrieve information in related tables in a natural way using rules.
Starting in 1986 the team released a number of papers describing the basis of the system, and by 1988 the project had a prototype version up and running. The team released version 1 to a small number of users in June 1989, followed by Version 2 with a re-written rules system in June 1990. 1991's Version 3 re-wrote the rules system again, but also added support for multiple storage managers and for an improved query engine. By 1993 a huge number of users existed and began to overwhelm the project with requests for support and features. After releasing a Version 4 -- primarily as a cleanup -- the project ended.
Although the Postgres project had officially ended, the BSD license (under which Berkeley had released Postgres) enabled Open Source developers to obtain copies and to develop the system further. In 1994 two UC Berkeley graduate students, Andrew Yu and Jolly Chen, added a SQL language interpreter to replace the earlier Ingres-based QUEL system, creating Postgres95. The code was subsequently released to the web to find its own way in the world. 1996 saw a re-naming of the project: in order to reflect the database's new SQL query language, Postgres95 became PostgreSQL.
The first PostgreSQL release formed version 6.0. Subsequently a group of database developers and volunteers from around the world, coordinated via the Internet, have maintained the software. Since version 6.0, many subsequent releases have appeared, and many improvements have occurred in the system; on January 19, 2005 version 8.0 became the current release.
Although the license allowed for the commercialization of Postgres, the Postgres code did not develop commercially with the same rapidity as Ingres -- somewhat surprisingly considering the advantages Postgres offered. The main offshoot originated when Paula Hawthorn (an original Ingres team member who moved from Ingres) and Michael Stonebraker formed Illustra Information Technologies to commercialize Postgres.
Description
A cursory examination of PostgreSQL might suggest that the system resembles other database systems. PostgreSQL uses the SQL language to run queries on data. That data exists as a series of tables with foreign keys linking related data together. One might characterise the primary advantage of PostgreSQL over some of its competitors as programmability: PostgreSQL makes it much easier to build real-world applications using data taken from the database.
The SQL data stores simple data types in "flat tables", requiring the user to gather together related information using queries. This contrasts with the way applications and users utilise the data itself: typically in a high-level language with rich data types where all of the related data operates as a complete unit of its own, typically referred to as a record or object (depending on the language).
Converting information from the SQL world into the object-oriented programming world presents difficulties because the two have very different models of data organization. The industry knows this problem as impedance mismatch: mapping from one model to the other may take up to 40% of a project's development time. A number of mapping solutions, typically referred to as object-relational mapping, address the issue, but they tend to cost a lot and have problems of their own, causing poor performance or forcing all data access to take place through the one language that the mapping supports.
PostgreSQL can solve many of these issues directly in the database. PostgreSQL allows the user to define new types based on the normal SQL types, allowing the database itself to understand complex data. For instance, you can define an address
to consist of several strings for things like street number, city and country. From that point on one can easily create tables containing all the fields needed to hold an address with a single line.
PostgreSQL also allows types to include inheritance, one of the major concepts in object-oriented programming. For instance, one could define a post_code
type, and then create us_zip_code
and canadian_postal_code
based on it. Addresses in the database could then take either us_address
or canadian_address
form, and specialized rules could validate the data in each case. In early versions of PostgreSQL, implementing new types required writing C extensions and compiling them into the database server; in version 7.4, it has become much easier to create and use custom types via CREATE DOMAIN
.
The programming of the database itself can profit greatly from the use of functions. Most SQL systems allow users to write a stored procedure, a block of SQL code that other SQL statements can call. However SQL itself remains unsuitable as a programming language, and SQL users can experience great difficulty in constructing complex logic. Worse, SQL itself does not support many of the most basic operations in a programming language, like branching and looping. Instead each vendor has written their own extensions to the SQL language to add these features, and such extensions do not necessarily operate across database platforms.
In PostgreSQL programmers can write such logic in any one of a considerable set of supported languages.
- A built-in language called PL/PgSQL resembles Oracle's procedural language PL/SQL, and offers particular advantages when dealing with query-intensive procedures.
- Wrappers for popular scripting languages such as Perl, Python, Tcl, and Ruby allow harnessing their strengths in string processing and in linking to extensive libraries of outside functions.
- Procedures requiring the high performance provided by compiling complex logic into machine code can utilise C or C++.
- On the more esoteric side, a handler for the R statistical language allows database queries to take advantage of its rich set of statistical functions.
The programmer can insert the code into the server as a function, a small wrapper that makes the code resemble a stored procedure. In this way SQL code can call (for instance) C code and vice-versa.
- Performance improves because the database engine invokes all the logic at one time in one place, reducing the number of round trips between client and server.
- Reliability improves due to the centralisation of data verification code in one place, on the server, without relying on synchronising logic in multiple client applications, perhaps even in multiple programming languages.
- By adding useful abstractions to the server, client code can become much shorter and simpler.
These advantages add up to making PostgreSQL arguably the most advanced database system from a programming perspective, which helps to explain the success of Illustra. Using PostgreSQL can dramatically reduce overall programming time on many projects, with its advantages growing with project complexity.
Features
Some features of PostgreSQL rarely found in other relational databases include:
- User-defined types
- User-defined operators
- Availability of multiple stored procedure languages, including C, PL/Perl (http://www.postgresql.org/docs/current/interactive/plperl.html), PL/python (http://www.postgresql.org/docs/current/interactive/plpython.html), PL/PHP (http://plphp.commandprompt.com/), PL/Java (http://gborg.postgresql.org/project/pljava/projdisplay.php), PL/sh (http://developer.postgresql.org/~petere/pgplsh/), PL/Tcl (http://www.postgresql.org/docs/current/interactive/pltcl.html), PL/R (http://www.joeconway.com/plr/), PL/Ruby (http://raa.ruby-lang.org/list.rhtml?name=pl-ruby), or the native PL/PgSQL
- Support for geographic objects via PostGIS
- Support for IP address (including IPv6), CIDR block, and MAC address data types
- Concurrency managed via a Multi-Version Concurrency Control (MVCC) design, which ensures excellent performance even under heavy concurrent access
- Table inheritance
- Rules — a way of implementing server-side logic that allows the application developer to modify the "query tree" of an incoming query
- Expressional indexes — indexes created on the value of an expression, rather than on a single column or set of columns
- Partial indexes — indexes created on a portion of a table. These can save disk space and improve performance if only part of the table actually requires an index
In addition, PostgreSQL supports almost all the constructs expected from an enterprise-level database, including:
- Referential integrity constraints including foreign key constraints, column constraints, and row checks
- Triggers
- Views
- Outer joins
- Sub-selects
- Transactions
- Strong compliance with the SQL standard (SQL92, SQL:1999)
- Encrypted connections via SSL
- Binary and textual large-object storage
- Online backup
- Domains
- Tablespaces
- Savepoints
- Point-in-time recovery
See also
- List of relational database management systems
- List of object-relational database management systems
- Comparison of relational database management systems
- Comparison of object-relational database management systems
External links
- PostgreSQL Website (http://www.postgresql.org/)
- PostgreSQL Documentation (http://www.postgresql.org/docs/)
- Tuning guide (http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html)
- Annotated postgresql.conf and Global User Configuration (GUC) Guide (http://www.powerpostgresql.com/Downloads/annotated_conf_80.html)
- SourceForge PostgreSQL-related projects (http://sourceforge.net/softwaremap/trove_list.php?form_cat=525)
- PgFoundry PostgreSQL-related projects (http://www.pgfoundry.org/)
- Open Source Database Network (http://osdb.org/)ca:PostgreSQL
cs:PostgreSQL de:PostgreSQL es:PostgreSQL fr:PostgreSQL it:PostgreSQL ja:PostgreSQL nl:PostgreSQL pl:PostgreSQL pt:PostgreSQL ru:PostgreSQL sk:PostgreSQL tr:PostgreSQL uk:PostgreSQL vi:PostgreSQL zh:PostgreSQL