timerring

Python Underlying Mechanism

December 23, 2024 · 7 min read · Page View:
Tutorial
Python
If you have any questions, feel free to comment below.

This article will review the underlying mechanism of Python, including data types and derivation operations.

CH8 Underlying Mechanism #

Data Type Implementation #

Strange List #

list_1 = [1, [22, 33, 44], (5, 6, 7), {"name": "Sarah"}]
  • Shallow copy
# list_3 = list_1          # Error!
list_2 = list_1.copy()     # Or list_1[:] \ list(list_1) can also implement shallow copy
  • Operations on the shallow copy of the two lists
list_2[1].append(55)

print("list_1:  ", list_1)
# list_1:   [1, [22, 33, 44, 55], (5, 6, 7), {'name': 'Sarah'}]
print("list_2:  ", list_2)
# list_2:   [1, [22, 33, 44, 55], (5, 6, 7), {'name': 'Sarah'}]

List Implementation #

Concept of reference array

The elements inside the list can be scattered in memory

The list actually stores the addresses of these elements, and the addresses are stored in a continuous manner.

Shallow copy copies the addresses of the elements.

list_1 = [1, [22, 33, 44], (5, 6, 7), {"name": "Sarah"}]
list_2 = list(list_1)   # Shallow copy  The same functionality as list_1.copy()

(1)New elements

list_1.append(100)
list_2.append("n")

print("list_1:  ", list_1)
# list_1:   [1, [22, 33, 44], (5, 6, 7), {'name': 'Sarah'}, 100]
print("list_2:  ", list_2)
# list_2:   [1, [22, 33, 44], (5, 6, 7), {'name': 'Sarah'}, 'n']

(2)Modify elements, pay attention to the difference between modifying mutable or immutable types.

  • Lists and dictionaries, which are mutable types, content changes, but the address does not change.
  • While tuples, numbers, and strings, which are immutable types, content changes, and the address changes.
list_1[0] = 10
list_2[0] = 20

print("list_1:  ", list_1)
# list_1:   [10, [22, 33, 44], (5, 6, 7), {'name': 'Sarah'}, 100]
print("list_2:  ", list_2)
# list_2:   [20, [22, 33, 44], (5, 6, 7), {'name': 'Sarah'}, 'n']

(3)Operations on list elements

list_1[1].remove(44)
list_2[1] += [55, 66]

print("list_1:  ", list_1)
# list_1:   [10, [22, 33, 55, 66], (5, 6, 7), {'name': 'Sarah'}, 100]
print("list_2:  ", list_2)
# list_2:   [20, [22, 33, 55, 66], (5, 6, 7), {'name': 'Sarah'}, 'n']

Because the operation is on the list, and the original list maps the address, after modifying the element, the address is mapped, so the modification of list1 and 2 is the same

(4)Operations on tuple elements

list_2[2] += (8,9)

print("list_1:  ", list_1)
# list_1:   [10, [22, 33, 55, 66], (5, 6, 7), {'name': 'Sarah'}, 100]
print("list_2:  ", list_2)
# list_2:   [20, [22, 33, 55, 66], (5, 6, 7, 8, 9), {'name': 'Sarah'}, 'n']

Tuples are immutable! They are equivalent to adding a tuple (5, 6, 7, 8, 9), and list2 points to this tuple.

(5)Operations on dictionary elements

list_1[-2]["age"] = 18

print("list_1:  ", list_1)
# list_1:   [10, [22, 33, 55, 66], (5, 6, 7), {'name': 'Sarah', 'age': 18}, 100]
print("list_2:  ", list_2)
# list_2:   [20, [22, 33, 55, 66], (5, 6, 7, 8, 9), {'name': 'Sarah', 'age': 18}, 'n']

Deep Copy #

After shallow copy

  • Operations on immutable elements (numbers, strings, tuples) are effective
  • Operations on mutable elements (lists, sets) cause some confusion

Introducing deep copy

  • Deep copy copies all related elements at all levels, completely separating them, clearly distinguishing between them, avoiding the above issues
import copy

list_1 = [1, [22, 33, 44], (5, 6, 7), {"name": "Sarah"}]
list_2 = copy.deepcopy(list_1)
list_1[-1]["age"] = 18
list_2[1].append(55)

print("list_1:  ", list_1)
# list_1:   [1, [22, 33, 44], (5, 6, 7), {'name': 'Sarah', 'age': 18}]
print("list_2:  ", list_2)
# list_2:   [1, [22, 33, 44, 55], (5, 6, 7), {'name': 'Sarah'}]

Dictionary Implementation #

Implement value storage and access through sparse arrays

Dictionary Creation Process

  • Create a sparse array (N » n)
d = {}
  • Step 1: Calculate the hash value of the key through hash()
print(hash("python"))
# -4771046564460599764
print(hash((1,2)))
# 3713081631934410656

# Adding a key-value pair, first calculate the hash value of the key hash("age")
d["age"] = 18    
print(hash("age")) 
  • Step 2: Determine its position in the sparse array based on the calculated hash value, and if there is a hash value collision, there is a corresponding method to resolve the conflict.
  • Step 3: Store the value at that position

Summary

(1)Dictionary data type, through space for time, implements fast data lookup, which also means that the space utilization efficiency of the dictionary is low.

(2)Because the order of the hash value corresponding position may be different from the order of the key in the dictionary, the dictionary appears to be unordered.

Compact String #

  • Implement string storage through compact arrays, data is stored in memory in a continuous manner, more efficient, and saves space.

  • As a sequence type, why is the list implemented using reference arrays, while the string is implemented using compact arrays? The list can change, so it is not convenient to reserve space.

Mutable and Immutable Types #

  1. Immutable types: numbers, strings, tuples
  • The content remains unchanged throughout its lifecycle, in other words, if it is changed, it is no longer itself (the id changes).
  • The += operation on immutable objects actually creates a new object

Tuples are not always immutable,if a tuple contains a mutable type, then the tuple can still change.

t = (1,[2])
t[1].append(3)
# (1, [2, 3])
  1. Mutable types: lists, dictionaries, sets
  • id remains unchanged, but the content can change
  • The += operation on mutable objects actually modifies the original object in place

A Few Examples of List Operations #

Deleting a Specific Element in a List #

  • Method 1 Existence Operation Deletion

Disadvantage: Each existence operation requires traversing the list from the beginning, searching, and being inefficient

alist = ["d", "d", "d", "2", "2", "d" ,"d", "4"]
s = "d"
while True:
    if s in alist:
        alist.remove(s)
    else:
        break
print(alist)
# ['2', '2', '4']
  • Method 2 Delete all elements at once

First, alist is being deleted, but the index s is in order, so there may be a phenomenon where some elements are skipped, but the deletion is still performed in the order of scanning from the list head.

alist = ["d", "d", "d", "2", "2", "d" ,"d", "4"]
for s in alist:
    if s == "d":
        alist.remove(s)      # remove(s) Remove the first occurrence of the element in the list
print(alist)
# ['2', '2', 'd', 'd', '4']
  • Solution: Use negative indexing

Negative indexing scans in reverse order, ensuring that each traversal is the list head, and the deletion is also the list head.

alist = ["d", "d", "d", "2", "2", "d" ,"d", "4"]
for i in range(-len(alist), 0):
    if alist[i] == "d":
        alist.remove(alist[i])      # remove(s) Remove the first occurrence of the element in the list
print(alist)
# ['2', '2', '4']

Creating a Multi-Dimensional List #

ls = [[0]*10]*5
ls[0][0] = 1
# [[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

Because the four lists below are copies of the first list, so if the first list changes, the following ones will also change.

More Concise Syntax #

Parsing Syntax #

ls = [[0]*10 for i in range(5)]
ls[0][0] = 1
# [[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

List Comprehension #

[expression **for value in iterable** if condition]

# Equivalent to the following code
result = []
for value in iterale:
    if condition:
        result.append(expression)
squares = [i**2 for i in range(1,21) if i%2 == 1]
print(squares) 

Support multiple variables

x = [1, 2, 3]
y = [1, 2, 3]

results = [ i*j for i,j in zip(x, y)]
results
# [1, 4, 9]

Support nested loops

colors = ["black", "white"]
sizes = ["S", "M", "L"]
tshirts = ["{} {}".format(color, size) for color in colors for size in sizes]
tshirts
# ['black S', 'black M', 'black L', 'white S', 'white M', 'white L']

Dictionary Comprehension #

squares = {i: i**2 for i in range(3)}
for k, v in squares.items():
    print(k, ":  ", v)
# 0 :  0
# 1 :  1
# 2 :  4

Set Comprehension #

squares = {i**2 for i in range(10)}
squares
# {0, 1, 4, 9, 16, 25, 36, 49, 64, 81}

Generator Expression #

squares = (i**2 for i in range(10))
squares
# <generator object <genexpr> at 0x000001DB37A58390>

Conditional Expression #

expr1 if condition else expr2

n = -10
if n >= 0:
    x = n
else:
    x = -n

# which is equivalent to
x = n if n>= 0 else -n
# 10

Related readings


<< prev | Python Files... Continue strolling Python... | next >>

If you find this blog useful and want to support my blog, need my skill for something, or have a coffee chat with me, feel free to: