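Python Core Technology in Practice

I have been using Python at work since 2019, but never as my main language, so my knowledge of it is probably neither detailed nor complete. My current company uses it to pull and process data for BI reports and to serve the backend APIs of several games, so I am relearning this "glue language" by working through the Geek Time (极客时间) column 《Python核心技术与实战》 ("Python Core Technology in Practice").

For other Python notes, see the notebook.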

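Tuples and Lists

Both lists and tuples are ordered collections that can hold elements of any data type.

Lists are dynamic: their length can change, and elements can be added, removed, or modified freely. A list takes slightly more storage than a tuple and is slightly slower.
Tuples are static: their length is fixed, and elements cannot be added, removed, or modified. Tuples are more lightweight than lists and slightly faster.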

l = [1, 2, 3, 4]
l[3] = 40  # as in many languages, indexing starts at 0, so l[3] is the fourth element

tup = (1, 2, 3, 4)
new_tup = tup + (5,)  # creates a new tuple new_tup, filled with the original values followed by 5

# Lists and tuples can be nested freely
l = [[1, 2, 3], [4, 5]]  # each element of the list is itself a list
tup = ((1, 2, 3), (4, 5, 6))  # each element of the tuple is itself a tuple

# Converting between the two
list((1, 2, 3))
# [1, 2, 3]
tuple([1, 2, 3])
# (1, 2, 3)
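To make the "lighter and immutable" claim concrete, here is a minimal sketch that compares the size of a list and a tuple holding the same values, then tries to modify the tuple. It assumes CPython, where sys.getsizeof reports the size of the container object itself; the variable names are only illustrative.

```python
import sys

numbers_list = [1, 2, 3, 4]
numbers_tuple = (1, 2, 3, 4)

list_size = sys.getsizeof(numbers_list)    # bytes used by the list object itself
tuple_size = sys.getsizeof(numbers_tuple)  # bytes used by the tuple object itself
print(list_size, tuple_size)

# Tuples are immutable: item assignment raises TypeError.
try:
    numbers_tuple[3] = 40
except TypeError as err:
    tuple_error = str(err)
    print('TypeError:', tuple_error)
```

The exact byte counts depend on the Python version and platform, so only the relative difference is meaningful.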

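Common functions

count(item) returns the number of times item appears in the list/tuple.
index(item) returns the index of the first occurrence of item in the list/tuple.
list.reverse() and list.sort() reverse and sort a list in place (note that tuples have neither method).
reversed() and sorted() also reverse and sort a list/tuple, but leave the original untouched: reversed() returns an iterator over the elements in reverse order, and sorted() always returns a new sorted list, even when given a tuple.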
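The difference between the in-place methods and the built-in functions is easy to get wrong, so here is a small sketch using made-up sample values:

```python
scores = [3, 1, 2]
frozen_scores = (3, 1, 2)

sorted_copy = sorted(frozen_scores)      # a new list, even though the input is a tuple
reversed_iter = reversed(frozen_scores)  # an iterator, not a tuple
reversed_tuple = tuple(reversed_iter)    # materialize it explicitly if a tuple is needed

result = scores.sort()  # sorts the list in place and returns None
print(sorted_copy, reversed_tuple, scores, result)
```

Because list.sort() returns None, writing scores = scores.sort() is a common mistake that silently discards the list.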

Code performance

python3 -m timeit 'empty_list=[]'
python3 -m timeit 'empty_list=list()'

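Dictionaries and Sets

Sets do not support indexing, because a set is essentially a hash table, unlike a list.

Creating, accessing, adding, removing, and updating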

d1 = {'name': 'jason', 'age': 20, 'gender': 'male'}
d2 = dict({'name': 'jason', 'age': 20, 'gender': 'male'})
d3 = dict([('name', 'jason'), ('age', 20), ('gender', 'male')])
d4 = dict(name='jason', age=20, gender='male')
d1 == d2 == d3 == d4
# True

s1 = {1, 2, 3}
s2 = set([1, 2, 3])
s1 == s2
# True

d = {'name': 'jason', 'age': 20}
d.get('name')
# 'jason'
d.get('location', 'null')  # returns the default when the key is missing
# 'null'

d = {'name': 'jason', 'age': 20}
d['gender'] = 'male'  # add the pair 'gender': 'male'
d['dob'] = '1999-02-01'  # add the pair 'dob': '1999-02-01'
d
# {'name': 'jason', 'age': 20, 'gender': 'male', 'dob': '1999-02-01'}
d['dob'] = '1998-01-01'  # update the value for key 'dob'
d.pop('dob')  # remove the pair whose key is 'dob'
# '1998-01-01'
d
# {'name': 'jason', 'age': 20, 'gender': 'male'}

s = {1, 2, 3}
s.add(4)  # add element 4 to the set
s
# {1, 2, 3, 4}
s.remove(4)  # remove element 4 from the set
s
# {1, 2, 3}

To test whether an element is in a dict or a set, use value in dict/set. For a dict, this checks the keys, not the values.

s = {1, 2, 3}
1 in s   # True
10 in s  # False

d = {'name': 'jason', 'age': 20}
'name' in d      # True
'location' in d  # False
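Two details from this section are worth checking by hand: in on a dict looks only at keys, and a set cannot be indexed. A minimal sketch, reusing the sample data above:

```python
d = {'name': 'jason', 'age': 20}
s = {1, 2, 3}

key_found = 'name' in d              # True: 'name' is a key
value_as_key = 'jason' in d          # False: 'jason' is a value, not a key
value_found = 'jason' in d.values()  # True: search the values explicitly

# Sets are unordered hash tables, so indexing is not supported.
try:
    s[0]
except TypeError as err:
    set_error = str(err)
    print('TypeError:', set_error)
```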

Sorting a dict by key or value, ascending or descending

# Sorting a dict
d = {'b': 1, 'a': 2, 'c': 10}
d_sorted_by_key = sorted(d.items(), key=lambda x: x[0])  # ascending by key
d_sorted_by_value = sorted(d.items(), key=lambda x: x[1])  # ascending by value
d_sorted_by_key
# [('a', 2), ('b', 1), ('c', 10)]
d_sorted_by_value
# [('b', 1), ('a', 2), ('c', 10)]

Sorting a set

s = {3, 4, 2, 1}
sorted(s)  # sort the set's elements in ascending order; the result is a list
# [1, 2, 3, 4]

Iterating over a list of tuples

def find_product_price(products, product_id):
    # Linear scan: check each (id, price) pair until the id matches
    for id, price in products:
        if id == product_id:
            return price
    return None

products = [
    (143121312, 100),
    (432314553, 30),
    (32421912367, 150)
]
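For comparison, the same lookup can be sketched with a dict keyed by product id. The list version above scans every pair in the worst case (O(n)), while a dict lookup is O(1) on average. This is an illustrative rewrite, not code from the column; it keeps the function name and sample data from above.

```python
def find_product_price(products, product_id):
    """Return the price for product_id, or None if it is absent."""
    return products.get(product_id)

# Same sample data as above, keyed by product id.
products = {
    143121312: 100,
    432314553: 30,
    32421912367: 150,
}

found_price = find_product_price(products, 432314553)
missing_price = find_product_price(products, 1)
print(found_price, missing_price)
```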

Beyond the dict structure itself, the hash table stores the indices separately from the entries (hash value, key, and value):

Indices
----------------------------------------------------
None | index | None | None | index | None | index ...
----------------------------------------------------

Entries
--------------------
hash0   key0  value0
---------------------
hash1   key1  value1
---------------------
hash2   key2  value2
---------------------
        ...
---------------------
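One visible consequence of keeping entries in a separate array is that dicts remember insertion order, which the language guarantees since Python 3.7. The sketch below only demonstrates that observable behavior; it does not inspect the internal layout, which is a CPython implementation detail.

```python
d = {}
d['b'] = 1
d['a'] = 2
d['c'] = 10

keys_in_order = list(d)  # keys come back in insertion order

del d['a']
d['a'] = 3               # re-inserting a key places it at the end
keys_after_reinsert = list(d)

print(keys_in_order, keys_after_reinsert)
```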

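Strings in Depth

In Python, strings written with single quotes, double quotes, or triple quotes are exactly the same type.

Multi-line comments (a triple-quoted string can serve as one)

"""
comment text
xxxx
"""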

String slicing

name = 'jason'
name[0]
# 'j'
name[1:3]  # characters at index 1 and 2; the end index is excluded
# 'as'

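Python strings are immutable

s = ''
for n in range(0, 100000):
    s += str(n)

For str1 += str2, Python (CPython) first checks whether str1 has any other references. If it has none, the interpreter tries to grow the string's buffer in place rather than allocating new memory, creating a new string, and copying. In that case the loop above runs in only O(n) time (Python 2.5+), so there is usually no need to worry about the efficiency of +=.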
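A short sketch of what immutability means in practice, plus str.join as the usual alternative when the in-place optimization cannot be relied on (it is an implementation detail, not a language guarantee). The loop size and names are arbitrary.

```python
name = 'jason'

# Item assignment fails because strings are immutable.
try:
    name[0] = 'J'
except TypeError as err:
    immutability_error = str(err)

capitalized = 'J' + name[1:]  # build a new string instead

# Two ways to concatenate many pieces; both give the same result.
n = 1000
s = ''
for i in range(n):
    s += str(i)

joined = ''.join(str(i) for i in range(n))
print(capitalized, s == joined)
```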

String formatting

print('no data available for person with id: {}, name: {}'.format(id, name))  # newer str.format style
print('no data available for person with id: %s, name: %s' % (id, name))  # older % style
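Since Python 3.6 there is a third option, f-strings, which are not mentioned above. The sketch below shows that all three styles produce the same text; person_id and name are sample values chosen for illustration.

```python
person_id = 123
name = 'jason'

with_format = 'no data available for person with id: {}, name: {}'.format(person_id, name)
with_percent = 'no data available for person with id: %s, name: %s' % (person_id, name)
with_fstring = f'no data available for person with id: {person_id}, name: {name}'

print(with_fstring)
```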

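Input and Output

The open() function returns a file object. Its first argument is the file path (relative or absolute). The second is the mode: 'r' for reading, 'w' for writing, and 'r+' when you need both reading and writing. 'a' is a less common but useful mode that means append: when a file opened this way is written to, the data goes at the very end of the existing file.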

with open('in.txt', 'r') as fin:
    text = fin.read()  # read the whole file
    # readline() reads a single line; read() also accepts a size argument
word_and_freq = parse(text)  # parse() is a user-defined function that is not shown here
with open('out.txt', 'w') as fout:
    for word, freq in word_and_freq:
        fout.write('{} {}\n'.format(word, freq))
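The snippet above relies on a parse() function that these notes do not show. Purely as an assumption, the sketch below supplies a hypothetical parse() that counts lowercase words, and runs the same read/parse/write flow inside a temporary directory so it can be executed as-is.

```python
import os
import re
import tempfile
from collections import Counter

def parse(text):
    """Hypothetical parse(): return (word, frequency) pairs, most frequent first."""
    words = re.findall(r'[a-z]+', text.lower())
    return Counter(words).most_common()

with tempfile.TemporaryDirectory() as workdir:
    in_path = os.path.join(workdir, 'in.txt')
    out_path = os.path.join(workdir, 'out.txt')

    # Create a small sample input file.
    with open(in_path, 'w') as fout:
        fout.write('the cat and the dog and the bird\n')

    with open(in_path, 'r') as fin:
        word_and_freq = parse(fin.read())

    with open(out_path, 'w') as fout:
        for word, freq in word_and_freq:
            fout.write('{} {}\n'.format(word, freq))

    with open(out_path, 'r') as fin:
        report = fin.read()

print(report)
```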

JSON Serialization in Practice

json.dumps() takes a Python basic data type and serializes it to a string;
json.loads() takes a valid JSON string and deserializes it back into a Python basic data type.

import json

params = {
    'symbol': '123456',
    'type': 'limit',
    'price': 123.4,
    'amount': 23
}

params_str = json.dumps(params)  # dict -> JSON string

print('after json serialization')
print('type of params_str = {}, params_str = {}'.format(type(params_str), params_str))

original_params = json.loads(params_str)  # JSON string -> dict

For files, use json.dump() and json.load() instead:

import json

params = {
    'symbol': '123456',
    'type': 'limit',
    'price': 123.4,
    'amount': 23
}

with open('params.json', 'w') as fout:
    json.dump(params, fout)  # writes to the file and returns None

with open('params.json', 'r') as fin:
    original_params = json.load(fin)

print('after json deserialization')
print('type of original_params = {}, original_params = {}'.format(type(original_params), original_params))

Conditions and Loops

if condition_1:
    statement_1
elif condition_2:
    statement_2
else:
    statement_n

for item in <iterable>:
    ...
# Iterating over a dict
d = {'name': 'jason', 'dob': '2000-01-01', 'gender': 'male'}
for k in d:  # iterate over the keys
    print(k)
for v in d.values():  # iterate over the values
    print(v)
for k, v in d.items():  # iterate over the key-value pairs
    print('key: {}, value: {}'.format(k, v))

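Iterating over collections

for index, item in enumerate(l):
    ...

for index in range(0, len(l)):
    ...

# Use continue and break to simplify code, improve readability, and reduce nesting

# Performance: a for loop over range() is faster than the equivalent while loop.
# range() is implemented in C; in the while loop, i is an immutable int,
# so i += 1 creates a new int object each time (roughly i = new int(i + 1)).
for i in range(0, 1000000):
    pass

i = 0
while i < 1000000:
    i += 1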
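The advice about continue is easier to see with concrete data. In this sketch (sample list invented for illustration), continue skips unwanted items early so the main logic stays at one indentation level instead of being wrapped in an if block.

```python
l = [5, -3, 0, 8, -1]

positives = []
for index, item in enumerate(l):
    if item <= 0:
        continue  # skip non-positive values early
    positives.append((index, item))

print(positives)
```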

Expressions and syntactic sugar

# expression1 if condition else expression2 for item in iterable
y = [value * 2 + 5 if value > 0 else -value * 2 + 5 for value in x]

# Without else, the condition goes at the end:
# expression for item in iterable if condition

text = ' Today,  is, Sunday'
text_list = [s.strip() for s in text.split(',') if len(s.strip()) > 3]
print(text_list)
# ['Today', 'Sunday']

[(xx, yy) for xx in x for yy in y if xx != yy]  # equivalent to the loops below

l = []
for xx in x:
    for yy in y:
        if xx != yy:
            l.append((xx, yy))

attributes = ['name', 'dob', 'gender']
values = [['jason', '2000-01-01', 'male'],
['mike', '1999-01-01', 'male'],
['nancy', '2001-02-01', 'female']
]

# expected output:
# [{'name': 'jason', 'dob': '2000-01-01', 'gender': 'male'},
#  {'name': 'mike', 'dob': '1999-01-01', 'gender': 'male'},
#  {'name': 'nancy', 'dob': '2001-02-01', 'gender': 'female'}]

# Two equivalent ways to build it:
[{attributes[i]: value[i] for i in range(len(attributes))} for value in values]
[dict(zip(attributes, value)) for value in values]

Exception Handling

If an ordinary check in code can handle the case, don't reach for exceptions.

try:
    s = input('please enter two numbers separated by comma: ')
    num1 = int(s.split(',')[0].strip())
    num2 = int(s.split(',')[1].strip())
    ...
except (ValueError, IndexError) as err:  # one handler for several exception types
    print('Error: {}'.format(err))

print('continue')

try:
    s = input('please enter two numbers separated by comma: ')
    num1 = int(s.split(',')[0].strip())
    num2 = int(s.split(',')[1].strip())
    ...
except ValueError as err:
    print('Value Error: {}'.format(err))
except IndexError as err:
    print('Index Error: {}'.format(err))
except Exception:  # anything else
    print('Other error')
finally:
    # Runs whether or not an exception occurred; release resources here (e.g. close a file)
    print('cleanup')
print('continue')
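Both points above, preferring a plain check for expected cases and finally running on every path, can be sketched without user input. parse_pair and the events list are hypothetical names used only for this illustration.

```python
d = {'name': 'jason'}
location = d.get('location', 'unknown')  # a plain lookup with a default, no exception needed

events = []

def parse_pair(s):
    """Hypothetical helper: parse 'a,b' into two ints, or return None."""
    try:
        first, second = s.split(',')
        return int(first), int(second)
    except ValueError:
        events.append('error')
        return None
    finally:
        events.append('finally')  # runs on both paths, even after a return

ok = parse_pair('1, 2')
bad = parse_pair('1')
print(location, ok, bad, events)
```

Unpacking a single value into two names raises ValueError, which is why one except clause covers both malformed inputs and non-numeric ones.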

Custom exceptions

class MyInputError(Exception):
    """Exception raised when there're errors in input"""
    def __init__(self, value):  # initializer of the custom exception type
        self.value = value
    def __str__(self):  # string representation of the custom exception type
        return ("{} is invalid input".format(repr(self.value)))

try:
    raise MyInputError(1)  # raise the MyInputError exception
except MyInputError as err:
    print('error: {}'.format(err))

import json

try:
    data = json.loads(raw_data)  # raw_data is assumed to be defined elsewhere
    ...
except json.JSONDecodeError as err:
    print('JSONDecodeError: {}'.format(err))
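A runnable version of the pattern above, wrapped in a hypothetical helper named load_or_none so that both the success and failure paths can be exercised:

```python
import json

def load_or_none(raw_data):
    """Return the decoded JSON value, or None if raw_data is not valid JSON."""
    try:
        return json.loads(raw_data)
    except json.JSONDecodeError as err:
        print('JSONDecodeError: {}'.format(err))
        return None

good = load_or_none('{"symbol": "123456", "amount": 23}')
bad = load_or_none('{not valid json}')
print(good, bad)
```

json.JSONDecodeError is a subclass of ValueError, so code that catches ValueError handles it as well.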

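Custom Functions

def name(param1, param2, ..., paramN):
    statements
    return/yield value # optional

Unlike compiled languages (C, for example), def is an executable statement, which means a function does not exist until its def statement has run. When execution reaches the def statement, Python creates a new function object and binds it to the given name.

When the main program calls a function, that function must already have been defined, otherwise an error is raised. Calls made from inside another function body are not affected, because the name is only looked up when that body actually runs.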

my_func('hello world')
def my_func(message):
    print('Got a message: {}'.format(message))
# Output
# NameError: name 'my_func' is not defined
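The sketch below shows both halves of the rule: a function body may refer to a function defined later in the file, but a top-level call made before the def has run fails. The function names are invented for illustration.

```python
def caller():
    # helper does not exist yet when caller is defined; the name is looked up at call time.
    return helper() + 1

def helper():
    return 41

answer = caller()  # works: both def statements have run by now

# Calling a function before its def statement has executed raises NameError.
try:
    not_defined_yet()
except NameError as err:
    name_error = str(err)

def not_defined_yet():
    return 'now defined'

print(answer, name_error, not_defined_yet())
```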

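Nested Functions

First, nesting keeps the inner function private. The inner function can only be called and accessed by the outer function and is not exposed in the global scope. So if a function handles private data (such as a database username and password) that you don't want exposed, you can wrap it in an inner function and reach it only through the outer function. For example:

def connect_DB():
    def get_DB_configuration():
        ...
        return host, username, password
    # connector stands for a database driver module that is not shown here
    conn = connector.connect(get_DB_configuration())
    return conn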

Second, sensible use of nested functions can make a program run more efficiently.

Here the input validation runs only once, in the outer function; the recursive inner function skips it.

def factorial(input):
    # validation check
    if not isinstance(input, int):
        raise Exception('input must be an integer.')
    if input < 0:
        raise Exception('input must be greater or equal to 0')
    ...

    def inner_factorial(input):
        if input <= 1:
            return 1
        return input * inner_factorial(input-1)
    return inner_factorial(input)


print(factorial(5))

A function cannot freely modify the value of a global variable

MIN_VALUE = 1
MAX_VALUE = 10
def validation_check(value):
    ...
    MIN_VALUE += 1
    ...
validation_check(5)
# UnboundLocalError: local variable 'MIN_VALUE' referenced before assignment

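Because MIN_VALUE is assigned inside the function, the interpreter treats it as a local variable, but then finds that the local MIN_VALUE has no value yet when += tries to read it, so the operation fails. If you really must change a global variable from inside a function, declare it with global:

MIN_VALUE = 1
MAX_VALUE = 10
def validation_check(value):
    global MIN_VALUE
    ...
    MIN_VALUE += 1
    ...
validation_check(5)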

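If a local variable has the same name as a global one, the local variable is used inside the function

MIN_VALUE = 1
MAX_VALUE = 10
def validation_check(value):
    MIN_VALUE = 3  # a new local variable; the global MIN_VALUE is unchanged
    ...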

For nested functions, the inner function can read variables defined in the outer function but cannot rebind them. To modify one, it must declare it with the nonlocal keyword.

def outer():
    x = "local"
    def inner():
        nonlocal x  # nonlocal means this x is the variable x defined in the outer function
        x = 'nonlocal'
        print("inner:", x)
    inner()
    print("outer:", x)
outer()
# Output
# inner: nonlocal
# outer: nonlocal

# Without the nonlocal declaration, the output would be
# inner: nonlocal
# outer: local
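A common practical use of nonlocal is keeping state between calls. The counter below is a minimal sketch with invented names; without the nonlocal line, count += 1 would raise UnboundLocalError for the same reason as the MIN_VALUE example earlier.

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebind the variable owned by make_counter
        count += 1
        return count
    return increment

counter = make_counter()
first = counter()
second = counter()

other = make_counter()  # each call to make_counter gets its own count
other_first = other()
print(first, second, other_first)
```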

Closures

A closure is similar to the nested functions described above. The difference is that the outer function returns a function rather than a concrete value. The returned function is usually assigned to a variable, which can then be called later.

def nth_power(exponent):
    def exponent_of(base):
        return base ** exponent
    return exponent_of  # the return value is the function exponent_of

square = nth_power(2)  # computes the square of a number
cube = nth_power(3)  # computes the cube of a number
square
# Output
# <function __main__.nth_power.<locals>.exponent_of(base)>

cube
# Output
# <function __main__.nth_power.<locals>.exponent_of(base)>

print(square(2))  # 2 squared
print(cube(2))  # 2 cubed
# Output
# 4  (2^2)
# 8  (2^3)

def nth_power_rewrite(base, exponent):
    return base ** exponent
# Without a closure: the exponent is repeated on every call
res1 = nth_power_rewrite(base1, 2)
res2 = nth_power_rewrite(base2, 2)
res3 = nth_power_rewrite(base3, 2)
...

# With a closure: more concise
square = nth_power(2)
res1 = square(base1)
res2 = square(base2)
res3 = square(base3)
...
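To see where the captured exponent actually lives, the sketch below repeats nth_power and inspects the returned functions. __closure__ and co_freevars are standard attributes of function and code objects; the other variable names are illustrative.

```python
def nth_power(exponent):
    def exponent_of(base):
        return base ** exponent
    return exponent_of

square = nth_power(2)
cube = nth_power(3)

# Each returned function carries its own captured value of exponent.
captured_by_square = square.__closure__[0].cell_contents
captured_by_cube = cube.__closure__[0].cell_contents
free_vars = square.__code__.co_freevars

print(captured_by_square, captured_by_cube, free_vars)
```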

Closures are often used together with decorators.
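As a rough illustration of that connection, here is a small hypothetical decorator, count_calls, whose inner wrapper closes over the decorated function. It is a sketch of the general pattern, not an example taken from the column.

```python
import functools

def count_calls(func):
    """Hypothetical decorator: count how many times func is called."""
    @functools.wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)  # func is captured by the closure
    wrapper.calls = 0
    return wrapper

@count_calls
def square(x):
    return x ** 2

results = [square(n) for n in range(3)]
print(results, square.calls)
```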

Anonymous Functions

lambda argument1, argument2,... argumentN : expression
square = lambda x: x**2
square(3)
# 9

First, lambda is an expression, not a statement.

As a result, lambda can be used in places where a regular def cannot:

[(lambda x: x*x)(x) for x in range(10)]
# Output
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

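A lambda can be written inline as an argument to another function, whereas a def statement cannot appear there (a function created with def has to be defined first and then passed by name):

l = [(1, 20), (3, 0), (9, 10), (2, -1)]
l.sort(key=lambda x: x[1])  # sort by the second element of each tuple
print(l)
# Output
# [(2, -1), (3, 0), (9, 10), (1, 20)]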

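Second, the body of a lambda is a single simple expression and cannot be expanded into a multi-line block.

Python introduced lambda so that it and regular functions could each do their own job: lambda handles simple tasks, while regular functions take care of more complex, multi-line logic. Guido van Rossum, the creator of Python, once wrote an article explaining this point.

What functions are for (this applies to lambda as well)

Reducing code duplication;
Modularizing code.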

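The function map(function, iterable) takes a function object as its first argument and an iterable as its second. It applies function to every element of iterable.

python3 -mtimeit -s'xs=range(1000000)' 'map(lambda x: x*2, xs)'
2000000 loops, best of 5: 171 nsec per loop

python3 -mtimeit -s'xs=range(1000000)' '[x * 2 for x in xs]'
5 loops, best of 5: 62.9 msec per loop

python3 -mtimeit -s'xs=range(1000000)' 'l = []' 'for i in xs: l.append(i * 2)'
5 loops, best of 5: 92.7 msec per loop

map looks fastest by a wide margin here, but the comparison is not fair in Python 3: map() returns a lazy iterator, so the first command only measures creating the map object (hence nanoseconds), not doubling a million numbers. For a like-for-like measurement, time list(map(lambda x: x*2, xs)) instead. map itself is implemented in C, but with a lambda it still calls a Python function for every element, so it is not necessarily faster than a list comprehension.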
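The laziness of map in Python 3 is easy to verify directly. This sketch uses a tiny range so the results can be checked by eye:

```python
xs = range(5)

lazy = map(lambda x: x * 2, xs)    # no element has been processed yet
is_list = isinstance(lazy, list)   # False: it is a map iterator

doubled_map = list(lazy)           # the work happens here
doubled_comp = [x * 2 for x in xs]
leftover = list(lazy)              # the iterator is now exhausted

print(is_list, doubled_map, doubled_comp, leftover)
```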

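filter(function, iterable)

l = [1, 2, 3, 4, 5]
new_list = list(filter(lambda x: x % 2 == 0, l))  # [2, 4]; filter() itself returns a lazy iterator

reduce(function, iterable)

from functools import reduce  # reduce is not a built-in in Python 3

l = [1, 2, 3, 4, 5]
product = reduce(lambda x, y: x * y, l)  # 1*2*3*4*5 = 120

Sorting a dict by value, from highest to lowest

sorted(d.items(), key=lambda x: x[1], reverse=True)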
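Putting the three snippets above into one runnable piece, with the sample dict reused from the sorting section:

```python
from functools import reduce

l = [1, 2, 3, 4, 5]
evens = list(filter(lambda x: x % 2 == 0, l))  # keep only even numbers
product = reduce(lambda x, y: x * y, l)        # ((((1*2)*3)*4)*5)

d = {'b': 1, 'a': 2, 'c': 10}
by_value_desc = sorted(d.items(), key=lambda x: x[1], reverse=True)

print(evens, product, by_value_desc)
```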

Object-Oriented Programming

class Document():
    WELCOME_STR = 'Welcome! The context for this book is {}.'

    def __init__(self, title, author, context):
        print('init function called')
        self.title = title
        self.author = author
        self.__context = context  # attributes starting with __ are private

    def intercept_context(self, length):
        self.__context = self.__context[:length]

    # class method
    @classmethod
    def create_empty_book(cls, title, author):
        return cls(title=title, author=author, context='nothing')

    # instance method
    def get_context_length(self):
        return len(self.__context)

    # static method
    @staticmethod
    def get_welcome(context):
        return Document.WELCOME_STR.format(context)
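The class above is never instantiated in these notes, so here is a trimmed-down copy with a usage sketch. It also shows how "private" works in practice: Python renames __context to _Document__context (name mangling) rather than truly hiding it. The author string is just a sample value.

```python
class Document:
    WELCOME_STR = 'Welcome! The context for this book is {}.'

    def __init__(self, title, author, context):
        self.title = title
        self.author = author
        self.__context = context  # stored as _Document__context

    def get_context_length(self):
        return len(self.__context)

    @classmethod
    def create_empty_book(cls, title, author):
        return cls(title=title, author=author, context='nothing')

    @staticmethod
    def get_welcome(context):
        return Document.WELCOME_STR.format(context)

book = Document.create_empty_book('Harry Potter', 'J. K. Rowling')
length = book.get_context_length()       # length of the string 'nothing'
welcome = Document.get_welcome('magic')  # no instance needed

# Outside the class, the unmangled name does not exist.
try:
    book.__context
except AttributeError:
    direct_access_failed = True
else:
    direct_access_failed = False

mangled = book._Document__context  # still reachable, so privacy is a convention
print(length, welcome, direct_access_failed, mangled)
```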

Inheritance

class Entity():
    def __init__(self, object_type):
        print('parent class init called')
        self.object_type = object_type

    # A subclass that wants to use get_context_length must override it, otherwise this raises
    def get_context_length(self):
        raise Exception('get_context_length not implemented')

    def print_title(self):
        print(self.title)

class Document(Entity):
    def __init__(self, title, author, context):
        print('Document class init called')
        Entity.__init__(self, 'document')  # call the parent initializer explicitly
        self.title = title
        self.author = author
        self.__context = context

    def get_context_length(self):
        return len(self.__context)
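A compact variant of the same hierarchy, sketched with super() in place of the explicit Entity.__init__ call and NotImplementedError in place of the bare Exception. Video is a hypothetical subclass added only to show what happens when the method is not overridden.

```python
class Entity:
    def __init__(self, object_type):
        self.object_type = object_type

    def get_context_length(self):
        raise NotImplementedError('get_context_length not implemented')


class Document(Entity):
    def __init__(self, title, context):
        super().__init__('document')  # same effect as Entity.__init__(self, 'document')
        self.title = title
        self.__context = context

    def get_context_length(self):
        return len(self.__context)


class Video(Entity):  # hypothetical subclass that does not override the method
    def __init__(self, title):
        super().__init__('video')
        self.title = title


doc = Document('Harry Potter', 'some text')
video = Video('trailer')
doc_length = doc.get_context_length()

try:
    video.get_context_length()
except NotImplementedError as err:
    video_error = str(err)

print(doc.object_type, doc_length, video.object_type, video_error)
```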

Abstract Classes

An abstract class is a special kind of class that exists only to serve as a parent class; instantiating it raises an error. Likewise, abstract methods are defined in an abstract class, and a subclass must override them before they can be used. Abstract methods are marked with the @abstractmethod decorator.

from abc import ABCMeta, abstractmethod

class Entity(metaclass=ABCMeta):
    @abstractmethod
    def get_title(self):
        pass

    @abstractmethod
    def set_title(self, title):
        pass

class Document(Entity):
    def get_title(self):
        return self.title

    def set_title(self, title):
        self.title = title

document = Document()
document.set_title('Harry Potter')
print(document.get_title())

entity = Entity()

########## Output ##########

Harry Potter

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-266b2aa47bad> in <module>()
     21 print(document.get_title())
     22 
---> 23 entity = Entity()
     24 entity.set_title('Test')

TypeError: Can't instantiate abstract class Entity with abstract methods get_title, set_title
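The same behavior can be checked without crashing the script by catching the TypeError. This sketch uses abc.ABC, which is shorthand for metaclass=ABCMeta, and adds a hypothetical Untitled subclass to show that forgetting to override an abstract method also blocks instantiation.

```python
from abc import ABC, abstractmethod

class Entity(ABC):
    @abstractmethod
    def get_title(self):
        ...

class Untitled(Entity):  # hypothetical subclass that forgets to override get_title
    pass

class Document(Entity):
    def __init__(self, title):
        self.title = title

    def get_title(self):
        return self.title

failures = []
for cls in (Entity, Untitled):
    try:
        cls()
    except TypeError:
        failures.append(cls.__name__)

title = Document('Harry Potter').get_title()
print(failures, title)
```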

Other

https://github.com/cr-mao/learn-ai/tree/main/pycode
