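The base64 module provides functions for converting binary data into a subset of ASCII, typically so that it can be sent over plain-text protocols.
The base64, base32, and base16 encodings convert 8-bit bytes into characters that each carry 6, 5, or 4 bits of data, which lets non-ASCII bytes travel over protocols that expect ASCII, such as SMTP. The "base" value is the length of the alphabet used by each encoding. There are also URL-safe variants of the original encodings that use slightly different alphabets.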
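The relationship between bits per character and output length can be checked directly. The following is a minimal sketch rather than one of the original examples: the helper name `encoded_lengths` is made up for illustration, and the code is written so that it should run under both Python 2 and Python 3.

```python
import base64


def encoded_lengths(data):
    """Return the encoded length of data under each of the three encodings."""
    return {
        'base64': len(base64.b64encode(data)),
        'base32': len(base64.b32encode(data)),
        'base16': len(base64.b16encode(data)),
    }


# 15 bytes is 120 bits, which divides evenly by 6, 5 and 4,
# so none of the encodings needs padding for this input.
data = b'\x00' * 15
lengths = encoded_lengths(data)
for name, bits in [('base64', 6), ('base32', 5), ('base16', 4)]:
    assert lengths[name] == len(data) * 8 // bits
    print('%s: %d bytes -> %d characters' % (name, len(data), lengths[name]))
```

The smaller the alphabet, the fewer bits each output character carries and the longer the encoded result becomes.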
A basic example of encoding some text looks like this:
import base64
initial_data = open(__file__, 'rt').read()
encoded_data = base64.b64encode(initial_data)
num_initial = len(initial_data)
padding = { 0:0, 1:2, 2:1 }[num_initial % 3]
print '%d bytes before encoding' % num_initial
print 'Expect %d padding bytes' % padding
print '%d bytes after encoding' % len(encoded_data)
print
print encoded_data
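The output shows that the 529 bytes of source text grow to 708 bytes once encoded. Each 24-bit sequence of input (3 bytes) is encoded as 4 bytes of output. The final two characters, ==, are padding, added because the number of bits in the original string is not evenly divisible by 24.
The encoder does not insert any line breaks in its output; the listing below has been wrapped so that it fits on the page.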
$ python base64_b64encode.py
529 bytes before encoding
Expect 2 padding bytes
708 bytes after encoding
IyEvdXNyL2Jpbi9lbnYgcHl0aG9uCiMgZW5jb2Rpbmc
6IHV0Zi04CiMKIyBDb3B5cmlnaHQgKGMpIDIwMDggRG
91ZyBIZWxsbWFubiBBbGwgcmlnaHRzIHJlc2VydmVkL
gojCiIiIgoiIiIKCl9fdmVyc2lvbl9fID0gIiRJZDog
cHltb3R3LnB5IDEyMzkgMjAwOC0wMS0xNiAxMDo1NTo
xOVogZGhlbGxtYW5uICQiCgppbXBvcnQgYmFzZTY0Cg
ppbml0aWFsX2RhdGEgPSBvcGVuKF9fZmlsZV9fLCAnc
nQnKS5yZWFkKCkKCmVuY29kZWRfZGF0YSA9IGJhc2U2
NC5iNjRlbmNvZGUoaW5pdGlhbF9kYXRhKQoKbnVtX2l
uaXRpYWwgPSBsZW4oaW5pdGlhbF9kYXRhKQpwYWRkaW
5nID0geyAwOjAsIDE6MiwgMjoxIH1bbnVtX2luaXRpY
WwgJSAzXQoKcHJpbnQgJyVkIGJ5dGVzIGJlZm9yZSBl
bmNvZGluZycgJSBudW1faW5pdGlhbApwcmludCAnRXh
wZWN0ICVkIHBhZGRpbmcgYnl0ZXMnICUgcGFkZGluZw
pwcmludCAnJWQgYnl0ZXMgYWZ0ZXIgZW5jb2RpbmcnI
CUgbGVuKGVuY29kZWRfZGF0YSkKcHJpbnQKcHJpbnQg
ZW5jb2RlZF9kYXRhCg==
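The size arithmetic for base64 can be written down as a small function and compared with what the encoder actually produces. This is a sketch only: `b64_size` is a hypothetical helper introduced here, not part of the base64 module, and the input is filler data rather than the script's own source.

```python
import base64


def b64_size(num_bytes):
    """Return (encoded_length, padding_chars) for num_bytes of input."""
    groups = (num_bytes + 2) // 3         # each 3-byte group, rounded up
    padding = (3 - num_bytes % 3) % 3     # 0, 2 or 1 '=' characters
    return groups * 4, padding


for n in (0, 1, 2, 3, 529):
    encoded = base64.b64encode(b'x' * n)
    length, padding = b64_size(n)
    # The predicted values should match the real encoder output.
    assert len(encoded) == length
    assert encoded.count(b'=') == padding
    print('%3d bytes -> %3d characters, %d padding' % (n, length, padding))
```

For 529 bytes of input this predicts 708 characters with 2 padding characters, which agrees with the listing above.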
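An encoded string can be converted back to its original form by taking each group of 4 bytes and turning it back into the original 3 using a reverse lookup in the alphabet. The b64decode() function does this for you.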
import base64
original_string = 'This is the data, in the clear.'
print 'Original:', original_string
encoded_string = base64.b64encode(original_string)
print 'Encoded :', encoded_string
decoded_string = base64.b64decode(encoded_string)
print 'Decoded :', decoded_string
$ python base64_b64decode.py
Original: This is the data, in the clear.
Encoded : VGhpcyBpcyB0aGUgZGF0YSwgaW4gdGhlIGNsZWFyLg==
Decoded : This is the data, in the clear.
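To make the reverse lookup concrete, here is a hand-written sketch that decodes a single unpadded 4-character group. The names `ALPHABET` and `decode_group` are assumptions made for this illustration; real code should simply call b64decode(), which also handles padding and validation.

```python
import string

# The standard base64 alphabet: a character's position is its 6-bit value.
ALPHABET = (string.ascii_uppercase + string.ascii_lowercase +
            string.digits + '+/')


def decode_group(group):
    """Decode one unpadded 4-character group into a list of 3 byte values."""
    bits = 0
    for ch in group:
        # Reverse lookup: shift in the 6 bits this character stands for.
        bits = (bits << 6) | ALPHABET.index(ch)
    # Split the 24 accumulated bits into three 8-bit values.
    return [(bits >> 16) & 0xFF, (bits >> 8) & 0xFF, bits & 0xFF]


# 'VGhp' is the first group of the encoded string shown above.
print(decode_group('VGhp'))
```

The three values returned for `'VGhp'` are the byte values of `T`, `h`, and `i`, the first three characters of the original string.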
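Because the default base64 alphabet may use + and /, and those two characters have special meaning in URLs, an alternate encoding with substitutes for them is needed. The + is replaced with -, and / is replaced with an underscore (_). The rest of the alphabet is the same.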
import base64
for original in [ '\xfb\xef', '\xff\xff' ]:
    print 'Original         :', repr(original)
    print 'Standard encoding:', base64.standard_b64encode(original)
    print 'URL-safe encoding:', base64.urlsafe_b64encode(original)
    print
$ python base64_urlsafe.py
Original : '\xfb\xef'
Standard encoding: ++8=
URL-safe encoding: --8=
Original : '\xff\xff'
Standard encoding: //8=
URL-safe encoding: __8=
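Since the URL-safe alphabet differs from the standard one in only two characters, the two encodings can be related by a plain character substitution. The sketch below checks that, and also uses the optional `altchars` argument of b64encode(), which accepts the two replacement characters; the sample bytes are chosen here only because they produce both + and / in the standard output.

```python
import base64

data = b'\xfb\xef\xff\xff'

standard = base64.standard_b64encode(data)
url_safe = base64.urlsafe_b64encode(data)

# The URL-safe form is the standard form with '+' -> '-' and '/' -> '_'.
assert url_safe == standard.replace(b'+', b'-').replace(b'/', b'_')

# The same result, asking b64encode() for an alternate alphabet directly.
assert url_safe == base64.b64encode(data, b'-_')

# Decoding has to use the matching function to get the original bytes back.
assert base64.urlsafe_b64decode(url_safe) == data
```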
Besides base 64, the module also provides functions for working with base 32 and base 16 (hex) encoded data.
import base64
original_string = 'This is the data, in the clear.'
print 'Original:', original_string
encoded_string = base64.b32encode(original_string)
print 'Encoded :', encoded_string
decoded_string = base64.b32decode(encoded_string)
print 'Decoded :', decoded_string
$ python base64_base32.py
Original: This is the data, in the clear.
Encoded : KRUGS4ZANFZSA5DIMUQGIYLUMEWCA2LOEB2GQZJAMNWGKYLSFY======
Decoded : This is the data, in the clear.
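The base 32 alphabet contains only upper case letters and digits, so lower case input is rejected unless the decoder is told to fold case. The following sketch illustrates the `casefold` argument of b32decode(); the `strict_ok` flag is just a variable introduced for this example, and the exception types caught are the ones Python 2 and Python 3 are expected to raise.

```python
import base64

data = b'This is the data, in the clear.'
encoded = base64.b32encode(data)
lowered = encoded.lower()

# By default only the upper case alphabet is accepted.
try:
    base64.b32decode(lowered)
    strict_ok = True
except (TypeError, ValueError):
    # Python 2 signals the bad input with TypeError,
    # Python 3 with binascii.Error, a subclass of ValueError.
    strict_ok = False

# With casefold=True the lower case form decodes to the original bytes.
relaxed = base64.b32decode(lowered, casefold=True)
assert relaxed == data
print('strict decode accepted lower case: %s' % strict_ok)
```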
The base 16 functions work with the hexadecimal alphabet.
import base64
original_string = 'This is the data, in the clear.'
print 'Original:', original_string
encoded_string = base64.b16encode(original_string)
print 'Encoded :', encoded_string
decoded_string = base64.b16decode(encoded_string)
print 'Decoded :', decoded_string
$ python base64_base16.py
Original: This is the data, in the clear.
Encoded : 546869732069732074686520646174612C20696E2074686520636C6561722E
Decoded : This is the data, in the clear.
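Base 16 output is ordinary hexadecimal, so it can be compared with the hex helpers in the standard binascii module. This is a small sketch of that relationship, assuming the same sample string as above; it is meant as a cross-check, not as a recommendation of one module over the other.

```python
import base64
import binascii

data = b'This is the data, in the clear.'
encoded = base64.b16encode(data)

# hexlify() produces lower case digits; b16encode() produces upper case.
assert binascii.hexlify(data).upper() == encoded

# Lower case hex is only accepted by b16decode() when casefold is true.
assert base64.b16decode(encoded.lower(), casefold=True) == data

# Every input byte becomes exactly two output characters, with no padding.
assert len(encoded) == 2 * len(data)
```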