The bytea data type allows storage of binary strings; see Table
5-6.
Table 5-6. Binary String Types
| Type Name |
Storage |
Description |
| bytea |
4 bytes plus the actual binary string |
Variable (not specifically limited) length binary
string |
A binary string is a sequence of octets (or bytes). Binary strings are distinguished from
characters strings by two characteristics: First, binary strings specifically allow storing
octets of zero value and other "non-printable" octets.
Second, operations on binary strings process the actual bytes, whereas the encoding and
processing of character strings depends on locale settings.
When entering bytea values, octets of certain values must be escaped (but all octet values may be escaped) when used as part of a
string literal in an SQL statement. In general, to escape an
octet, it is converted into the three-digit octal number equivalent of its decimal octet
value, and preceded by two backslashes. Some octet values have alternate escape sequences,
as shown in Table
5-7.
Table 5-7. bytea Literal Escaped Octets
| Decimal Octet Value |
Description |
Input Escaped Representation |
Example |
Printed Result |
| 0 |
zero octet |
'\\000' |
SELECT '\\000'::bytea; |
\000 |
| 39 |
single quote |
'\'' or '\\047' |
SELECT '\''::bytea; |
' |
| 92 |
backslash |
'\\\\' or '\\134' |
SELECT '\\\\'::bytea; |
\\ |
Note that the result in each of the examples in Table
5-7 was exactly one octet in length, even though the output representation of the zero
octet and backslash are more than one character. Bytea output octets
are also escaped. In general, each "non-printable"
octet decimal value is converted into its equivalent three digit octal value, and preceded
by one backslash. Most "printable" octets are
represented by their standard representation in the client character set. The octet with
decimal value 92 (backslash) has a special alternate output representation. Details are in Table
5-8.
Table 5-8. bytea Output Escaped Octets
| Decimal Octet Value |
Description |
Output Escaped Representation |
Example |
Printed Result |
| 92 |
backslash |
\\ |
SELECT '\\134'::bytea; |
\\ |
| 0 to 31 and 127 to 255 |
"non-printable"
octets |
\### (octal value) |
SELECT '\\001'::bytea; |
\001 |
| 32 to 126 |
"printable"
octets |
ASCII representation |
SELECT '\\176'::bytea; |
~ |
To use the bytea escaped octet notation, string literals (input
strings) must contain two backslashes because they must pass through two parsers in the PostgreSQL server. The first backslash is interpreted as an
escape character by the string-literal parser, and therefore is consumed, leaving the
characters that follow. The remaining backslash is recognized by the bytea
input function as the prefix of a three digit octal value. For example, a string literal
passed to the backend as '\\001' becomes '\001'
after passing through the string-literal parser. The '\001' is then
sent to the bytea input function, where it is converted to a single
octet with a decimal value of 1.
For a similar reason, a backslash must be input as '\\\\' (or '\\134'). The first and third backslashes are interpreted as escape
characters by the string-literal parser, and therefore are consumed, leaving two backslashes
in the string passed to the bytea input function, which interprets
them as representing a single backslash. For example, a string literal passed to the server
as '\\\\' becomes '\\' after passing
through the string-literal parser. The '\\' is then sent to the bytea input function, where it is converted to a single octet with a
decimal value of 92.
A single quote is a bit different in that it must be input as '\''
(or '\\047'), not
as '\\''. This is because, while the literal parser interprets the
single quote as a special character, and will consume the single backslash, the bytea input function does not
recognize a single quote as a special octet. Therefore a string literal passed to the
backend as '\'' becomes ''' after passing
through the string-literal parser. The ''' is then sent to the bytea input function, where it is retains its single octet decimal value
of 39.
Depending on the front end to PostgreSQL you use, you
may have additional work to do in terms of escaping and unescaping bytea
strings. For example, you may also have to escape line feeds and carriage returns if your
interface automatically translates these. Or you may have to double up on backslashes if the
parser for your language or choice also treats them as an escape character.
The SQL standard defines a different binary string type,
called BLOB or BINARY LARGE OBJECT. The input
format is different compared to bytea, but the provided functions and
operators are mostly the same.