Base64
In computing, base64 is a data encoding scheme whereby binary-encoded data is converted to printable ASCII characters. It is defined as a MIME content transfer encoding for use in internet e-mail. The only characters used are the upper- and lower-case Roman alphabet characters (A–Z, a–z), the numerals (0–9), and the "+" and "/" symbols, with the "=" symbol as a special suffix code.
Full specifications for base64 are contained in RFC 1421 and RFC 2045. The scheme is defined only for data whose original length is a multiple of eight bits, a requirement met by most computer file formats. The resultant base64-encoded data has a length that is approximately 33% greater than the original data, and typically appears as seemingly random characters.
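The size increase follows from the fact that every group of three input bytes becomes four output characters. A minimal sketch of that arithmetic, assuming padded output and ignoring any line breaks a mail encoder may insert (the function name `encoded_length` is illustrative only):

```python
def encoded_length(n_bytes: int) -> int:
    """Number of base64 characters produced for n_bytes of input, padding included."""
    # Each started group of 3 input bytes yields exactly 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


for n in (1, 2, 3, 300):
    print(n, encoded_length(n))
```

For 300 input bytes this gives 400 characters, a ratio of 4/3, which is where the figure of roughly 33% comes from.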
To convert data to base64, the first byte is placed in the most significant eight bits of a 24-bit buffer, the next in the middle eight, and the third in the least significant eight bits. If fewer than three bytes remain to be encoded, the corresponding buffer bits are set to zero. The buffer is then read six bits at a time, most significant first, and each six-bit value is used as an index into the string "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"; the character at that index is output. The process repeats on the remaining input data. If the final group holds only one or two input bytes, only the first two or three output characters are kept, and they are followed by two or one "=" characters respectively. The padding tells the decoder how many bytes the final group holds, which prevents extra bits being added to the reconstructed data.
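The procedure described above can be sketched in Python as follows. This is an illustrative implementation of the description, not a reference encoder: the names `ALPHABET` and `base64_encode` are chosen here, and no line breaks are inserted into the output.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def base64_encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Fill a 24-bit buffer, first byte in the most significant eight bits.
        # Bytes that are missing from a short final chunk leave their bits at zero.
        buffer = 0
        for j, byte in enumerate(chunk):
            buffer |= byte << (16 - 8 * j)
        # Read the buffer six bits at a time, most significant first.
        chars = [ALPHABET[(buffer >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # One input byte keeps 2 characters, two keep 3, three keep all 4;
        # the remainder of the group is filled with "=".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


print(base64_encode(b"Man"))  # TWFu
print(base64_encode(b"Ma"))   # TWE=
print(base64_encode(b"M"))    # TQ==
```

The three calls show the unpadded case and both padded cases for the first bytes of the example sentence below.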
For example, the historic Wikipedia slogan (http://en.wikipedia.org/wiki/Wikipedia:Logos_and_slogans),
- Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure.
encoded in base64 is as follows:
TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0
aGlzIHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1
c3Qgb2YgdGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0
aGUgY29udGludWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdl
LCBleGNlZWRzIHRoZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=
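Decoding reverses the process: four characters are gathered into a 24-bit buffer and split back into bytes, and each "=" removes one byte from the final group. The following is a minimal sketch, assuming well-formed input whose length (after whitespace is removed) is a multiple of four; the name `base64_decode` is illustrative and no error handling is attempted.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def base64_decode(text: str) -> bytes:
    # Drop spaces and line breaks, which carry no data.
    text = "".join(text.split())
    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        pad = group.count("=")
        # Rebuild the 24-bit buffer from four 6-bit values; "=" contributes zero bits.
        buffer = 0
        for ch in group:
            value = 0 if ch == "=" else ALPHABET.index(ch)
            buffer = (buffer << 6) | value
        # Each "=" means one fewer byte of real data in this group.
        out += buffer.to_bytes(3, "big")[:3 - pad]
    return bytes(out)


print(base64_decode("TWFuIGlz"))  # b'Man is'
```

Applied to the start of the encoded text above, the sketch recovers the opening words of the sentence.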
Basic spam scanners that do not decode base64 content will often pass base64-encoded messages, because the encoded text appears random and contains none of the keywords the scanner associates with spam.
Modified Base64 is a data encoding scheme whereby characters outside the 7-bit ASCII range (0x80 and above, in hexadecimal notation) are encoded using printable ASCII characters. It is a variant of base64, and is primarily used for encoding Unicode text into UTF-7 format for use in MIME messages. See UTF-7 for examples.
Modified Base64 is standardized as RFC 2152, A Mail-Safe Transformation Format of Unicode.
Its main difference from base64 is that it does not use the "=" symbol for padding, as that character tends to require a fair amount of escaping. Instead, the final bits are padded with zero bits to complete the last six-bit group.
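The effect of dropping the padding can be illustrated with a short sketch. It assumes the text is first serialized as big-endian UTF-16, which is the understanding of RFC 2152 used here and is not stated in this article; the function name `modified_base64` is hypothetical, and the surrounding UTF-7 shift characters are not produced.

```python
import base64


def modified_base64(text: str) -> str:
    # Assumption: the characters are serialized as big-endian UTF-16 before encoding.
    raw = text.encode("utf-16-be")
    # Ordinary base64 already fills the last 6-bit group with zero bits,
    # so removing the trailing "=" leaves the unpadded form.
    return base64.b64encode(raw).decode("ascii").rstrip("=")


print(modified_base64("\u00a3"))  # AKM
```

Here the pound sign (U+00A3) occupies two bytes, which ordinary base64 would encode as "AKM="; the modified form is simply "AKM".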
External links
- RFC 1421 (Privacy Enhancement for Electronic Internet Mail)
- RFC 2045 (MIME)
- RFC 3548 (The Base16, Base32, and Base64 Data Encodings)
- Source code for the base64 algorithm (http://base64.sourceforge.net/)
- Firefox extension that supports ASCII/Base64 conversions (http://leetkey.mozdev.org)
Resources
- Online Base64 Decoder/Encoder, SourceForge (http://makcoder.sourceforge.net/demo/base64.php)
- On-line Base64 Decoder/Encoder (http://www.opinionatedgeek.com/dotnet/tools/Base64Decode/)
- On-line Base64 Decoder/Encoder (http://www.swesecure.com/?ID=dba1761b-f461-40f1-a0c3-046c9bd87658&CP=encode)
- Base64 Decoder with graphical user interface (Windows) (http://www.microlyrix.com/software/b64dec/)
- Multi-platform Base64 Encoder/Decoder (http://www.fourmilab.ch/webtools/base64/)