|
The object-relational database management system now known as PostgreSQL
(and briefly called Postgres95) is derived from the POSTGRES package written at the University of California at
Berkeley. With over a decade of development behind it, PostgreSQL
is the most advanced open-source database available anywhere, offering multiversion
concurrency control, supporting almost all SQL constructs (including subselects, transactions,
and user-defined types and functions), and having a wide range of language bindings available
(including C, C++, Java, Perl, Tcl, and Python).
Implementation of the POSTGRES DBMS
began in 1986. The initial concepts for the system were presented in "The design of
POSTGRES", and the definition of the initial data model appeared in "The POSTGRES data
model". The design of the rule system at that time was described in "The design of the
POSTGRES rules system". The rationale and architecture of the storage manager were
detailed in "The design of the POSTGRES storage system".
POSTGRES has undergone several major releases since
then. The first "demoware" system became operational in
1987 and was shown at the 1988 ACM-SIGMOD Conference. Version
1, described in "The implementation of POSTGRES", was released to a few
external users in June 1989. In response to a critique of the first rule system ("A
commentary on the POSTGRES rules system"), the rule system was redesigned ("On Rules,
Procedures, Caching and Views in Database Systems"), and Version 2 was released
in June 1990 with the new rule system. Version 3 appeared in 1991 and added support for
multiple storage managers, an improved query executor, and a rewritten rule system.
For the most part, subsequent releases until Postgres95
(see below) focused on portability and reliability.
POSTGRES has been used to implement many different
research and production applications. These include: a financial data analysis system, a jet
engine performance monitoring package, an asteroid tracking database, a medical information
database, and several geographic information systems. POSTGRES
has also been used as an educational tool at several universities. Finally, Illustra
Information Technologies (later merged into Informix, which is now owned by IBM) picked
up the code and commercialized it. In late 1992, POSTGRES became the primary data manager
for the Sequoia 2000 scientific computing project.
The size of the external user community nearly doubled during 1993. It became
increasingly obvious that maintenance of the prototype code and support was taking up large
amounts of time that should have been devoted to database research. In an effort to reduce
this support burden, the Berkeley POSTGRES project
officially ended with Version 4.2.
In 1994, Andrew Yu and Jolly Chen added a SQL language interpreter to POSTGRES. Postgres95 was
subsequently released to the Web to find its own way in the world as an open-source
descendant of the original POSTGRES Berkeley code.
Postgres95 code was completely ANSI C and trimmed in
size by 25%. Many internal changes improved performance and maintainability. Postgres95 release 1.0.x ran about 30-50% faster on the Wisconsin
Benchmark compared to POSTGRES, Version 4.2. Apart from bug
fixes, the following were the major enhancements:
- The query language PostQUEL was replaced with SQL (implemented in the server).
  Subqueries were not supported until PostgreSQL (see below), but they could be imitated
  in Postgres95 with user-defined SQL functions. Aggregates were re-implemented. Support
  for the GROUP BY query clause was also added. The libpq interface remained available
  for C programs.
- In addition to the monitor program, a new program (psql) was provided for interactive
  SQL queries, which used GNU Readline.
- A new front-end library, libpgtcl, supported Tcl-based clients. A sample shell,
  pgtclsh, provided new Tcl commands to interface Tcl programs with the Postgres95
  backend.
- The large-object interface was overhauled. The Inversion large objects were the only
  mechanism for storing large objects. (The Inversion file system was removed.)
- The instance-level rule system was removed. Rules were still available as rewrite
  rules.
- A short tutorial introducing regular SQL features as well as those of Postgres95 was
  distributed with the source code.
- GNU make (instead of BSD make) was used for the build. Also, Postgres95 could be
  compiled with an unpatched GCC (data alignment of doubles was fixed).
By 1996, it became clear that the name "Postgres95"
would not stand the test of time. We chose a new name, PostgreSQL,
to reflect the relationship between the original POSTGRES
and the more recent versions with SQL capability. At the same
time, we set the version numbering to start at 6.0, putting the numbers back into the
sequence originally begun by the Berkeley POSTGRES project.
The emphasis during development of Postgres95 was on
identifying and understanding existing problems in the backend code. With PostgreSQL, the emphasis has shifted to augmenting features and
capabilities, although work continues in all areas.
Major enhancements in PostgreSQL include:
- Table-level locking has been replaced by multiversion concurrency control, which
  allows readers to continue reading consistent data during writer activity and enables
  hot backups from pg_dump while the database stays available for queries.
- Important backend features, including subselects, defaults, constraints, and
  triggers, have been implemented.
- Additional SQL92-compliant language features have been added, including primary keys,
  quoted identifiers, literal string type coercion, type casting, and binary and
  hexadecimal integer input.
- Built-in types have been improved, including new wide-range date/time types and
  additional geometric type support.
- Overall backend code speed has been increased by approximately 20-40%, and backend
  start-up time has decreased by 80% since version 6.0 was released.
|