Python

Here you can find some implementation details for Python, which usually turn out to be useful.

Data structures

Mutable vs Immutable

A first, crucial distinction is that in Python we both have mutable and immutable objects. The first can change their state. Most built-in types are immutable: int, float, bool, str, tuple, unicode. The built-in mutable types are list, dict, set. Note that identifiers for immutable data types point to the same object, if they have the same value. int(5) is int(5) is True. Note that the bindings are always mutable: an immutable object can be substituted using the same binding. Note that when passing variables to functions, immutable objects are passed by value, while mutable objects are passed by reference: if you pass a list to a function and edit it in the function, the list is changed outside too. Note that mutability is not transitive: if we have a tuple of lists, we can change the content of the lists.

Lists

Lists are dynamically resized: a tuple differs from a list from the fact that it is immutable
You can append(el), remove(value), insert(index, value)
A in array checks for presence of A with complexity O(n)
B=A copies the reference, B=list(A) copies the array, copy.deepcopy(A) does a deep copy
min(A) and max(A) are nice to use
For binary search, you can use bisect.bisect(array, element) which returns the i for which we can split the array and have everything that's less than element to the left
You can use slicing to rotate a list: A[k:]+A[k:]
Slicing can shallow-copy too: B =A[:]
Revert a list with a negative stepsize A[::-1]
List comprehension is multi-level too: [(x,y) for x in A for y in B] works

Binary search

In a sorted array, use bisect.bisect_left to get the first occurrence, bisect.bisect_right for the last+1

Dictionaries

Remove elements using del dictionary[key], or popitem() if you wanna pop a random one
It works like this:
- The hashtable is a contiguous vector of records, whose slots are of 3 types: key+value pairs, virgins, turds
- The table is doubled when the virigns are below 1/3 of the slots
- Note that when you del an element, the table isn't shrinked (as Python assumes you'll fill it with something else)
- dict=dict.copy() gets rid of the turds
- The hashing allows us to compute the table address from the key
- The hash function should be uniform, at least for the table addresses (as to avoid collisions), and usually gets modulo'ed to get the final table address
- Knowing the set S of keys, it is teoretically possible to construct a perfect hash function
- As eliminating collisions is technically impossible, some techniques are used: separated chainring consists in building a dictionary having as values nodes of a linked list that will then be linearly searched (so, if two elements point to the same hash, they will be consequent in the LL), while in open addressing the elements are inserted into the bucket, but they will just be inserted in the first empty address
- Python dictionaries use open hashing based on a primitive polynomial over Z/2
Use collections.defaultdict to avoid incurring in KeyErrors when retrieving new keys
Time-complexity is O(1) for access and addition

LinkedLists

We can consider singly and doubly LinkedLists: the latter have a node.previous attribute too
Nice because the elements do not have to be saved contiguously in the memory
Not allowing random access, many operations are slower
Circularly linked lists are interesting to represent arrays that are natural linked (think about the vertices of a polygon)
Often, using two pointers to traverse a LinkedList can be useful
To find loops in a LL, use two pointers, one traversing pointer = pointer.next.next: if they meet, there's a loop

Stacks/Queues

Both stacks and queues are implemented using LinkedLists, and just restrict the type of operations
You can use collections.deque for a list with O(1) append and retrieval from both sides
To create a queue using two stacks, you can keep a a tail and head, then fill the head inverting the tail when it is empty

Other syntactic sugar

You can use dataclasses.dataclass to create what is basically a struct:

from dataclasses import dataclass
@dataclass
class Rectangle
   # included in sort
    left :int
    bottom :int
    right :int
    # NOT included in sort.
    top :int = field(compare = false)
    # __init__() is autocreated

# Use *args to create
coords = [1,2,3,4]
rec = Rectangle(*coords)

You can use collections.namedtuple to have a named tuple:

from collections import namedtuple
Entry = namedtuple("Entry",['time', 'value'])
e = Entry(time = 1, value = 2)
e[0] == e.time #True

Itertools

Itertools is an awesome library that deals with iterators for efficient looping.

Decorators

WIP

Testing

assert is the simplest operation you can do to test values, which fails with an AssertionError in case of false

unittest has been the standard library for unit tests since 2.1

You put your tests into classes as methods
Then use a series of special assertion methods in the unittest.TestCase.class instead of the built-in assert

import unittest
class TestSum(unittest.TestCase):
    def test_sum(self):
        self.assertEqual(sum([1, 2, 3]), 6, "Should be 6")
    def test_sum_tuple(self):
        self.assertEqual(sum((1, 2, 2)), 6, "Should be 6")

if __name__ == '__main__':
    unittest.main()

You can run tests with python -m unittest test instead of that unittest.main()

Please put your tests into a separate file!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python.md

Python.md

Python

Data structures

Mutable vs Immutable

Lists

Binary search

Dictionaries

LinkedLists

Stacks/Queues

Other syntactic sugar

Itertools

Decorators

Testing

Files

Python.md

Latest commit

History

Python.md

File metadata and controls

Python

Data structures

Mutable vs Immutable

Lists

Binary search

Dictionaries

LinkedLists

Stacks/Queues

Other syntactic sugar

Itertools

Decorators

Testing