Introduction
This page is a draft
I am going to go for a Raymond Hettinger style presentation, https://www.cs.odu.edu/~tkennedy/cs330/f20/Public/languageResources/#python-programming-videos.
These materials are web-centric (i.e., do not need to be printed and are available at https://www.cs.odu.edu/~tkennedy/python-workshop).
Who am I?
I have taught various courses, including:
- CS 300T - Computers in Society
- CS 333 - Programming and Problem Solving
- CS 330 - Object Oriented Programming and Design
- CS 350 - Introduction to Software Engineering
- CS 410 - Professional Workforce Development I
- CS 411W - Professional Workforce Development II
- CS 417 - Computational Methods & Software
Most of my free time is spent writing Python 3 and Rust code, tweaking my Vim configuration, or learning a new (programming) language. My current language of interest is Rust (at the time of writing).
Referenced Courses & Materials
I will reference materials (e.g., lecture notes) and topics from CS 330, CS 350, CS 411W, and CS 417.
- CS 330 - Object Oriented Programming & Design
- CS 350 - Introduction to Software Engineering
- CS 417 - Computational Methods & Software
I will also reference a couple examples from the previous:
The Broad Strokes
This workshop is intended as a discussion on how to write Python code that makes use of:
Tentative Topics
I will focus on:
- Debugging options in Python (a language that promotes rapid development is usually hard to debug; how can we do it in Python?). Note: pdb
- Documenting code
- Advanced tutorials for modern development : Classes, Polymorphism, Interfaces, etc.
- Structuring large python codebases
- Profiling Python, engineering best practices
- Lambda functions, usage, syntax, and expressions
- Multithreading/Concurrent python
Classes and OOP will take a while (the topic is quite vast). I will also discuss testing python code (with unit testing and integration testing) and code coverage, along with tox for basic configuration management.
I will try to fit in a few of the remaining topics:
- Correct usage of the .loc and .iloc functionality
- How to use NumPy
- Python for Machine learning, Tensorflow
- Data and workflow management
- Pandas Dataframe usage
- How do we use RDD and DataFrame.
- Implementing cryptographic algorithms
I should be able to fit in some discussion of NumPy.
Code Documentation
These notes are based on my CS 330 (Object Oriented Programming and Design) and CS 417/517 (Computational Methods & Software) notes, https://www.cs.odu.edu/~tkennedy/cs330/f20/Public/codeDocumentation/index.html.
Most of your code has probably had quite a few in-line comments. In-line comments are not the focus of this discussion. The focus of this discussion is documentation of classes, functions, and methods.
A Few Starting Examples
I work in a few different languages. Throughout my
- C++ code you will find Doxygen style comments.
- Java code you will find Javadoc style comments.
- Python code you will find Pydoc style comments.
- Rust code you will find Rustdoc style comments.
You have definitely been told to "comment your code" in the past, but (probably) in a less formal fashion.
Let us start with a few selected documentation examples from my CS 330 and CS 417 notes.
C++
Doxygen can be used for C++. Consider the following Doxygen Example:
/**
* Retrieve the value stored in three selected Cells
*
* @param cell1Id numeric id representing the 1st desired cell
* @param cell2Id numeric id representing the 2nd desired cell
* @param cell3Id numeric id representing the 3rd desired cell
*
* @return value stored in the Cell
*
* @pre (cell1Id > 0 && cell1Id < 10) &&
* (cell2Id > 0 && cell2Id < 10) &&
* (cell3Id > 0 && cell3Id < 10)
*/
CellTriple get3Cells(int cell1Id, int cell2Id, int cell3Id) const;
Java
Javadoc can be used for Java. Consider the following Javadoc Example:
/**
* Multi-thread Coin Flip.
*
* @param numTrials # flips to simulate
* @param numThreads number of threads to use
*
* @return Completed FlipTasks
*
* @throws InterruptedException if a thread is stopped prematurely
*/
public static FlipTask[] multiThread(long numTrials, int numThreads)
throws InterruptedException
Python
Pydoc or Sphinx can be used for Python. Consider the following Pydoc Example:
from typing import Iterator, List, TextIO, Tuple


def parse_raw_temps(original_temps: TextIO,
                    step_size: int = 30,
                    units: bool = True) -> Iterator[Tuple[float, List[float]]]:
    """
    Take an input file and time-step size and parse all core temps.

    :param original_temps: an input file
    :param step_size: time-step in seconds
    :param units: True if the input file includes units and False if the file
                  includes only raw readings (no units)

    :yields: A tuple containing the next time step and a List containing _n_
             core temps as floating point values (where _n_ is the number of
             CPU cores)
    """
I prefer the Sphinx/Google style for Python.
def parse_raw_temps(original_temps: TextIO,
                    step_size: int = 30,
                    units: bool = True) -> Iterator[Tuple[float, List[float]]]:
    """
    Take an input file and time-step size and parse all core temps.

    Args:
        original_temps: an input file
        step_size: time-step in seconds
        units: True if the input file includes units and False if the file
            includes only raw readings (no units)

    Yields:
        A tuple containing the next time step and a List containing _n_
        core temps as floating point values (where _n_ is the number of
        CPU cores)
    """
Rust
///
/// Take a room and change the flooring
///
/// # Arguments
///
/// * `original` - House to change
///
/// # Returns
///
/// House with the updated flooring
///
fn upgrade_flooring(original: &House) -> House {
    //...
}
Rust and Python have similar documentation styles (give or take some markdown formatting). Since we only cover small snippets of Rust in this course (for context), we will forgo a complete Rustdoc discussion.
Writing Good Documentation
All code should be properly and fully documented using a language appropriate comment style. All functions (including parameters and return types) must be documented.
Documentation for a New Function
Suppose we have just finished writing a quick program to simulate a trick coin (i.e., a coin where heads and tails are not equally probable).
import random


def one_flip(p):
    return random.random() < p


def main():
    num_flips = 8

    for _i in range(0, num_flips):
        if one_flip(0.7):
            print("Heads")
        else:
            print("Tails")


if __name__ == "__main__":
    main()
The one_flip function needs a description.
def one_flip(p):
    """
    Simulate a single coin flip.
    """
What does p represent? Does it represent the probability of heads or tails?
def one_flip(p):
    """
    Simulate a single coin flip.

    Args:
        p: probability of heads in the range [0, 1]
    """
Now what about the return? We know that bool means True or False. Which one do I get for heads? Let us add a return description (@return in Doxygen, Returns: in a Google style docstring).
/**
* Simulate a single coin flip.
*
* @param p probability of heads
*
* @return true if the result is heads and false if the result is tails
*/
bool one_flip(double p);
def one_flip(p):
    """
    Simulate a single coin flip.

    Args:
        p: probability of heads in the range [0, 1]

    Returns:
        True if the result is heads and False if the result is tails
    """
There is no more ambiguity or guesswork. Both p and the possible return values are documented.
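The docstring is more than a comment: Python stores it on the function object, which is how help() and pydoc find it. The following is a minimal sketch using only the standard library. The one_flip body is the one from the coin flip program above, and summary_line is a name introduced here purely for illustration.

```python
import inspect
import random


def one_flip(p):
    """
    Simulate a single coin flip.

    Args:
        p: probability of heads in the range [0, 1]

    Returns:
        True if the result is heads and False if the result is tails
    """
    return random.random() < p


# The raw docstring is stored in the __doc__ attribute
print(one_flip.__doc__ is not None)  # True

# inspect.getdoc removes the common indentation and surrounding blank lines.
# This cleaned-up text is what help(one_flip) displays.
summary_line = inspect.getdoc(one_flip).splitlines()[0]
print(summary_line)  # Simulate a single coin flip.
```

Running help(one_flip) in an interactive session shows the same text, including the Args and Returns sections.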
Type Hints
I am a stickler for type hints...
def one_flip(p: float) -> bool:
    """
    Simulate a single coin flip.

    Args:
        p: probability of heads in the range [0, 1]

    Returns:
        True if the result is heads and False if the result is tails
    """
Object Oriented
We need to discuss the rules of a class checklist.
C++ | Java | Python 3 | Rust |
---|---|---|---|
Default Constructor | Default Constructor | __init__ | new() or Default trait |
Copy Constructor | Clone and/or Copy Constructor | __deepcopy__ | Clone trait |
Destructor | finalize (deprecated/discouraged) | __del__ | Drop trait |
Assignment Operator (=) | | | |
Accessors (Getters) | Accessors (Getters) | Accessors (@property) | Accessors (Getters) |
Mutators (Setters) | Mutators (Setters) | Setter (@attribute.setter) | Mutators (Setters) |
Swap | | | |
Logical Equivalence Operator (==) | equals | __eq__ | std::cmp::PartialEq trait |
Less-Than / Comes-Before Operator (<) | hashCode | __hash__ | std::cmp::PartialOrd trait |
std::hash (actual hashing) | hashCode | __hash__ | std::hash::Hash trait |
Stream Insertion Operator (<<) | toString | __str__ | std::fmt::Display trait |
 | | __repr__ | std::fmt::Debug trait |
begin() and end() | iterator | __iter__ | iter() and iter_mut() |
Whenever Python code is written, the first function most people write is usually __init__... since it serves as a constructor to initialize the fields (data members) of each new object. For now... let us focus on three methods:
- __str__ - generates a human readable string for output.
- __repr__ - generates a complete string for debugging, often in the form of a string that fully describes an object.
- __eq__ - compares two objects, returning True if they are equal. The objects need not be of the same type.
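The three methods can be sketched on a small class. Point is a hypothetical class used only for illustration (it is not part of the workshop code). Returning NotImplemented for an unrelated type is one common convention; returning False directly, as the Player class later in these notes does, is another.

```python
class Point:
    """A hypothetical 2-D point used only for illustration."""

    def __init__(self, x: float = 0.0, y: float = 0.0):
        self.x = x
        self.y = y

    def __str__(self) -> str:
        # Human readable output, used by print() and str()
        return f"({self.x}, {self.y})"

    def __repr__(self) -> str:
        # Complete description for debugging, used by repr() and the REPL
        return f"Point(x={self.x!r}, y={self.y!r})"

    def __eq__(self, rhs: object) -> bool:
        # The right-hand side need not be a Point
        if not isinstance(rhs, Point):
            return NotImplemented
        return (self.x, self.y) == (rhs.x, rhs.y)


origin = Point()

print(str(origin))                # (0.0, 0.0)
print(repr(origin))               # Point(x=0.0, y=0.0)
print(origin == Point(0.0, 0.0))  # True
print(origin == "origin")         # False
```

Note that a class that defines __eq__ without __hash__ becomes unhashable, which is why the two usually appear together.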
Tic-Tac-Toe Example
The code snippets in this section are part of a larger Tic-Tac-Toe example. The full source code can be found in this workshop's Git repository.
Let us start with the Player class. Note that the code is fully documented with pydoc documentation and type hints.
Note:
- The use of class Player(object): is a holdover from Python 2.*. In modern Python 3, it should not be used. The (object) should be omitted. The line should be class Player:.
- The Player class as written violates the MVC (Model-View-Controller) design pattern and S.O.L.I.D. The Player class should only handle representing a single player. All user interaction should be handled outside the class.
class Player(object):
    """
    This is more a Player interface than a Player class.

    However, such distinctions and discussions belong in
    the OOP and Inheritance Modules
    """

    PROMPT_MSG = "Enter your desired move (1-9): "
    """
    Message used to prompt a human player for a move.
    """

    @staticmethod
    def is_generic(possible_cylon: "Player") -> bool:
        """
        Checks whether a player is a placeholder or
        an actual player.

        Args:
            possible_cylon (Player): player whose humanity is in question

        Returns:
            True if the player is a Cylon
        """

        # print(REFERENCE_CYLON)
        return possible_cylon == REFERENCE_CYLON

    def __init__(self, n: str = "I. C. Generic"):
        """
        Create a Player with a selected name.

        Args:
            n: desired name
        """

        self._name = n
        self._symbol = '?'  # testing caught this

    def get_name(self) -> str:
        """
        Retrieve name.

        Returns:
            player name
        """

        return self._name

    def set_name(self, n: str):
        """
        Set player name.

        Args:
            n: new name (must not be empty)
        """

        self._name = n

    def next_move(self) -> int:
        """
        Retrieve the next move.

        Returns:
            board cell id representing the selected move

        Raises:
            ValueError: if the entered move is not an integer
        """

        choice = int(input(self._name + ", " + Player.PROMPT_MSG))

        return choice

    def is_human(self) -> bool:
        """
        Is this a Human Player?

        In this discussion, always yes :(

        Returns:
            True if the player is a human
        """

        return True

    def is_computer(self) -> bool:
        """
        Is this a Computer Player?

        In this discussion, always no :(

        Returns:
            True if the player is a Cylon
        """

        return False

    def get_symbol(self) -> str:
        """
        Retrieve player symbol to be used
        for marking moves.

        Returns:
            current player symbol
        """

        return self._symbol

    def set_symbol(self, new_symbol: str):
        """
        Change the player symbol.

        Args:
            new_symbol: new character to be used by the player
        """

        self._symbol = new_symbol

    def __eq__(self, rhs):
        if not isinstance(rhs, self.__class__):
            return False

        return self._name == rhs._name

    def __hash__(self):
        return hash(self._name)

    def __str__(self):
        """
        Generate a player string, but only the name.
        """

        return self._name

    def __deepcopy__(self, memo):
        """
        Create a new duplicate Player.
        """

        cpy = Player(self._name)
        cpy.set_symbol(self._symbol)

        return cpy


REFERENCE_CYLON = Player()
"""
A Player that serves as a sentinel value or placeholder.
"""
There are a few interesting mechanics...
- a decorator (i.e., @staticmethod)
- constants (i.e., PROMPT_MSG and REFERENCE_CYLON)
- __eq__, __hash__, __str__, and __deepcopy__ methods
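The __eq__ and __hash__ pair deserves a closer look: objects that compare equal must also hash equal, or sets and dictionaries will misbehave. The sketch below uses NamedPlayer, a hypothetical stand-in that copies only the name-based __eq__ and __hash__ from Player.

```python
class NamedPlayer:
    """Hypothetical stand-in with Player's name-based __eq__ and __hash__."""

    def __init__(self, name: str):
        self._name = name

    def __eq__(self, rhs: object) -> bool:
        if not isinstance(rhs, self.__class__):
            return False

        return self._name == rhs._name

    def __hash__(self) -> int:
        return hash(self._name)


tom = NamedPlayer("Tom")
also_tom = NamedPlayer("Tom")

# Equal objects hash equally, so a set keeps only one of them
roster = {tom, also_tom}
print(len(roster))  # 1

# Hazard: the hash depends on a mutable attribute. After a rename, the
# lookup uses the new hash, so the player is (almost certainly) not found
# even though the very same object is still stored in the set.
tom._name = "Thomas"
print(tom in roster)
```

This is the same behavior that test_set_name in the test suite relies on: changing the name changes the hash. It is one reason to avoid renaming objects while they are stored in a set or used as dictionary keys.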
Tic-Tac-Toe Tests
Part of writing "good" code (in any language) involves testing. Test Driven Development (TDD) involves writing tests alongside implementation. In theory:
- The interface for a module, class, or function is defined and documented. A stub is then written without any implementation.
- A test suite is written. The tests are then run. They should all fail.
- The actual implementation is written.
For object oriented code, I generally use the mutator-accessor strategy.
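The three steps can be sketched in one file. Counter, TestCounter, and run_suite are hypothetical names introduced for this illustration. Swapping the implementation in at run time is only a demonstration device; in a real project the stub would simply be edited in place.

```python
import unittest


class Counter:
    """Hypothetical class. Step 1: define and document the interface."""

    def __init__(self):
        self._count = 0

    def increment(self) -> None:
        """Increase the count by one (stub, no implementation yet)."""
        raise NotImplementedError()

    def get_count(self) -> int:
        """Retrieve the current count."""
        return self._count


class TestCounter(unittest.TestCase):
    """Step 2: one test per mutator, checked through the accessors."""

    def test_increment(self):
        counter = Counter()
        self.assertEqual(counter.get_count(), 0)

        counter.increment()
        self.assertEqual(counter.get_count(), 1)


def run_suite() -> unittest.TestResult:
    """Run TestCounter and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCounter)
    return unittest.TextTestRunner(verbosity=0).run(suite)


# With only the stub in place, the test does not pass (as expected)
before = run_suite()
print(before.wasSuccessful())  # False


def increment(self) -> None:
    """Step 3: the actual implementation."""
    self._count += 1


Counter.increment = increment  # swap in the implementation for the demo

after = run_suite()
print(after.wasSuccessful())  # True
```

The test calls the mutator (increment) and then checks every accessor (get_count), which is the mutator-accessor strategy in miniature.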
Let us examine the Unit Test Suite for the Player class.
from hamcrest import *
import unittest

from examples.player import Player

import copy


class TestPlayer(unittest.TestCase):
    """
    1 - Does this piece of code perform the operations
        it was designed to perform?

    2 - Does this piece of code do something it was not
        designed to perform?

    1 Test per mutator
    """

    def setUp(self):
        self.tom = Player("Tom")
        self.a_cylon = Player()
        self.the_doctor = Player("The Doctor")

        self.tom.set_symbol('X')

    def test_player_default_constructor(self):
        self.assertTrue(Player.is_generic(self.a_cylon))

        assert_that(self.a_cylon.get_symbol(), equal_to('?'))

        assert_that(hash(self.a_cylon), is_not(hash(self.tom)))
        assert_that(self.a_cylon, is_not(equal_to(self.tom)))

        # Hand wave... These are not the cylons you are looking for.
        assert_that(self.a_cylon.is_human(), is_(True))
        assert_that(self.a_cylon.is_computer(), is_(False))

    def test_player_constructor(self):
        self.assertEqual("Tom", str(self.tom))
        assert_that(str(self.tom), equal_to("Tom"))

        assert_that(hash(self.tom), is_not(hash(self.the_doctor)))
        assert_that(self.tom, is_not(equal_to(self.the_doctor)))

        assert_that(self.tom.is_human(), is_(True))
        assert_that(self.tom.is_computer(), is_(False))

    def test_set_symbol(self):
        old_hash_code = hash(self.tom)

        assert_that(self.tom.get_symbol(), is_('X'))
        assert_that(hash(self.tom), is_(old_hash_code))

        self.tom.set_symbol('O')
        assert_that(self.tom.get_symbol(), is_('O'))
        assert_that(hash(self.tom), is_(old_hash_code))

    def test_set_name(self):
        old_hash_code = hash(self.the_doctor)

        assert_that(self.the_doctor.get_name(), is_("The Doctor"))
        assert_that(hash(self.the_doctor), is_(old_hash_code))

        self.the_doctor.set_name("David Tennant")
        assert_that(self.the_doctor.get_name(), is_("David Tennant"))
        assert_that(hash(self.the_doctor), is_not(old_hash_code))

        self.the_doctor.set_name("Matt Smith")
        assert_that(self.the_doctor.get_name(), is_("Matt Smith"))
        assert_that(hash(self.the_doctor), is_not(old_hash_code))

        self.the_doctor.set_name("Peter Capaldi")
        assert_that(self.the_doctor.get_name(), is_("Peter Capaldi"))
        assert_that(hash(self.the_doctor), is_not(old_hash_code))

        self.the_doctor.set_name("Jodie Whittaker")
        assert_that(self.the_doctor.get_name(), is_("Jodie Whittaker"))
        assert_that(hash(self.the_doctor), is_not(old_hash_code))

        # No clone function, can't test equals

    def test_clone(self):
        the_original = copy.deepcopy(self.the_doctor)

        assert_that(hash(self.the_doctor), equal_to(hash(the_original)))
        assert_that(self.the_doctor, equal_to(the_original))
        assert_that(self.the_doctor.get_symbol(),
                    equal_to(the_original.get_symbol()))

        the_original.set_name("William Hartnell")
        assert_that(hash(self.the_doctor),
                    is_not(equal_to(hash(the_original))))
        assert_that(self.the_doctor, is_not(equal_to(the_original)))

    @unittest.skip("can not test")
    def test_next_move(self):
        # Can not test due to hardcoded input() use in Player.next_move
        pass
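The skipped test_next_move can be made to run by temporarily replacing the built-in input function. The following is a sketch using unittest.mock.patch from the standard library. PromptingPlayer is a hypothetical stand-in that copies only the input() call from Player.next_move. Moving user interaction out of the class (as the MVC note above suggests) would remove the need for patching altogether.

```python
import unittest
from unittest.mock import patch


class PromptingPlayer:
    """Hypothetical stand-in with the same input() call as Player.next_move."""

    PROMPT_MSG = "Enter your desired move (1-9): "

    def __init__(self, name: str = "I. C. Generic"):
        self._name = name

    def next_move(self) -> int:
        return int(input(self._name + ", " + PromptingPlayer.PROMPT_MSG))


class TestNextMove(unittest.TestCase):
    def test_next_move(self):
        tom = PromptingPlayer("Tom")

        # Replace the built-in input() for the duration of the with block
        with patch("builtins.input", return_value="5") as fake_input:
            self.assertEqual(tom.next_move(), 5)

        # The fake records how it was called, so the prompt can be checked
        fake_input.assert_called_once_with(
            "Tom, Enter your desired move (1-9): ")

    def test_next_move_rejects_non_numeric(self):
        tom = PromptingPlayer("Tom")

        with patch("builtins.input", return_value="five"):
            with self.assertRaises(ValueError):
                tom.next_move()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNextMove)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```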
Shapes Example
Draft
"""
This module provides the Shape class and related constants which serve as the
base for other (specialized) shapes.
"""
import abc

WIDTH_LABEL = 12  # Label Output Width
WIDTH_VALUE = 24  # Value Output Width

STR_FMT = f"{{:<{WIDTH_LABEL}}}:{{:>{WIDTH_VALUE}}}\n"
FPT_FMT = f"{{:<{WIDTH_LABEL}}}:{{:>{WIDTH_VALUE}.4f}}\n"


class Shape(metaclass=abc.ABCMeta):
    """
    Shape in a 2-D Cartesian Plane
    """

    @property
    @abc.abstractmethod
    def name(self) -> str:
        """
        Provide read-only access to the name attribute.

        Raises:
            NotImplementedError: if not overridden by subclass
        """

        raise NotImplementedError()

    @abc.abstractmethod
    def area(self) -> float:
        """
        Compute the area

        Raises:
            NotImplementedError: if not overridden by subclass
        """

        raise NotImplementedError()

    @abc.abstractmethod
    def perimeter(self) -> float:
        """
        Compute the perimeter

        Raises:
            NotImplementedError: if not overridden by subclass
        """

        raise NotImplementedError()

    @abc.abstractmethod
    def __deepcopy__(self, memo):
        """
        Return a new duplicate Shape

        Raises:
            NotImplementedError: if not overridden by subclass
        """

        raise NotImplementedError()

    @abc.abstractmethod
    def __str__(self) -> str:
        """
        Generate the common part of a shape string (the name line).

        Subclasses must override this method, but can reuse this
        default through super().__str__().
        """

        return STR_FMT.format("Name", self.name)
import copy

from shapes.shape import (Shape, FPT_FMT)


class Square(Shape):
    """
    A Rectangle with 4 Equal Sides
    """

    def __init__(self, side=1):
        """
        Construct a Square
        """

        self._side = side

    @property
    def name(self) -> str:
        """
        Provide read-only access to the name attribute.
        """

        return "Square"

    @property
    def side(self):
        return self._side

    @side.setter
    def side(self, some_value):
        self._side = some_value

    def area(self):
        """
        Compute the area
        """

        return self._side ** 2.0

    def perimeter(self):
        """
        Compute the perimeter
        """

        return 4 * self._side

    def __deepcopy__(self, memo):
        """
        Return a new duplicate Shape
        """

        return Square(copy.deepcopy(self.side))

    def __str__(self):
        """
        Print the Square
        """

        return (super().__str__()
                + FPT_FMT.format("Side", self.side)
                + FPT_FMT.format("Perimeter", self.perimeter())
                + FPT_FMT.format("Area", self.area()))
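What the abstract base class buys us can be sketched with cut-down versions of Shape and Square (one abstract method only). Here abc.ABC is used as shorthand for metaclass=abc.ABCMeta; the two are equivalent.

```python
import abc


class Shape(abc.ABC):
    """Cut-down version of the Shape interface (area only)."""

    @abc.abstractmethod
    def area(self) -> float:
        """Compute the area."""
        raise NotImplementedError()


class Square(Shape):
    """Cut-down Square that overrides the one abstract method."""

    def __init__(self, side: float = 1):
        self._side = side

    def area(self) -> float:
        return self._side ** 2.0


# An abstract class cannot be instantiated directly
try:
    Shape()
except TypeError as err:
    print(f"TypeError: {err}")

# A subclass that overrides every abstract method can be
print(Square(3).area())  # 9.0
```

The check happens at instantiation time, so a subclass that forgets to override a method fails as soon as an object is created, not when the missing method is eventually called.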
"""
This module provides factory utilities for creating shapes. This includes
recording which Shape types are available.
"""
import copy

from shapes.circle import Circle
from shapes.square import Square
from shapes.triangle import (Triangle, RightTriangle, EquilateralTriangle)

_KNOWN_SHAPES = {
    "Triangle": (
        Triangle(),
        lambda a, b, c: Triangle(a, b, c)
    ),
    "Right Triangle": (
        RightTriangle(),
        lambda base, height: RightTriangle(base, height)
    ),
    "Equilateral Triangle": (
        EquilateralTriangle(),
        lambda side: EquilateralTriangle(side)
    ),
    "Square": (
        Square(),
        lambda side: Square(side)
    ),
    "Circle": (
        Circle(),
        lambda radius: Circle(radius)
    )
}  # _Dictionary_ of known shapes


def create(name):
    """
    Create a Shape

    Args:
        name: the shape to be created

    Returns:
        A shape with the specified name or None if no matching shape is found
    """

    if name in _KNOWN_SHAPES:
        return copy.deepcopy(_KNOWN_SHAPES[name][0])

    return None


def create_from_dictionary(name, values):
    """
    Create a Shape

    Args:
        name: the shape to be created
        values: dictionary of values corresponding to the data needed
            to initialize a shape

    Returns:
        A shape with the specified name or None if no matching shape is found
    """

    if name in _KNOWN_SHAPES:
        return _KNOWN_SHAPES[name][1](**values)

    return None


def is_known(name):
    """
    Determine whether a given shape is known

    Args:
        name: the shape for which to query
    """

    return name in _KNOWN_SHAPES


def list_known():
    """
    Generate a list of known Shapes (one name per line)
    """

    return "\n".join([f" {name}" for name in _KNOWN_SHAPES])


def number_known():
    """
    Determine the number of known Shapes
    """

    return len(_KNOWN_SHAPES)
Shapes Driver
#! /usr/bin/env python3

# Programmer : Thomas J. Kennedy

import json
import pickle
import sys

from shapes import *
from shapes import shape_factory as ShapeFactory

PROGRAM_HEADING = ("Objects & Inheritance: 2-D Shapes",
                   "Thomas J. Kennedy")  # Program Title


def main():
    """
    The main function. In practice I could name this
    anything. The name main was selected purely
    out of familiarity.

    The "if __name__" line below determines what runs
    """

    if len(sys.argv) < 2:
        print("No input file provided.")
        print("Usage: {:} input_file".format(*sys.argv))
        sys.exit(1)

    shapes_filename = sys.argv[1]

    print("-" * 80)

    for line in PROGRAM_HEADING:
        print(f"{line:^80}")

    print("-" * 80)

    # Examine the ShapeFactory
    print("~" * 38)
    print("{:^38}".format("Available Shapes"))
    print("~" * 38)

    print(ShapeFactory.list_known())
    print("-" * 38)
    print("{:>2} shapes available.".format(ShapeFactory.number_known()))
    print()

    # The list needs to be initialized outside the "with" block
    shapes = list()

    with open(shapes_filename, "r") as shapes_in:
        for line in shapes_in:
            # Split on ";" and Strip leading/trailing whitespace
            # And Unpack the list
            name, values = [part.strip() for part in line.split(";")]

            values = json.loads(values)

            shapes.append(ShapeFactory.create_from_dictionary(name, values))

    # Remove all `None` entries with a list comprehension
    shapes = [s for s in shapes if s is not None]

    # Print all the shapes
    print("~" * 38)
    print("{:^38}".format("Display All Shapes"))
    print("~" * 38)

    for shp in shapes:
        print(shp)

    out_filename = "coolPickles.dat"

    with open(out_filename, "wb") as pickle_file:
        # LOL Nope
        # for s in shapes:
        #     pickle.dump(s, pickle_file)

        # One line, full data structure
        pickle.dump(shapes, pickle_file)

    with open(out_filename, "rb") as pickle_file:
        rebuilt_shapes = pickle.load(pickle_file)

    # Print all the rebuilt shapes
    print("~" * 38)
    print("{:^38}".format("Display Re-Built Shapes"))
    print("~" * 38)

    for shp in rebuilt_shapes:
        print(shp)

    print("~" * 38)
    print("{:^38}".format("Display Largest Shape (Area)"))
    print("~" * 38)

    largest_shape = max(rebuilt_shapes, key=lambda shape: shape.area())
    print(largest_shape)

    print("~" * 38)
    print("{:^38}".format("Display Smallest Shape (Perimeter)"))
    print("~" * 38)

    smallest_shape = min(rebuilt_shapes, key=lambda shape: shape.perimeter())
    print(smallest_shape)


if __name__ == "__main__":
    try:
        main()
    except FileNotFoundError as err:
        print(err)
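The pickle round trip in the driver can be tried without touching the file system. The sketch below uses an in-memory io.BytesIO buffer in place of coolPickles.dat, and Square is a hypothetical minimal stand-in for the real class.

```python
import io
import pickle


class Square:
    """Hypothetical minimal Square (stand-in for shapes.square.Square)."""

    def __init__(self, side: float = 1):
        self.side = side

    def area(self) -> float:
        return self.side ** 2.0


shapes = [Square(2), Square(5), Square(3)]

# An in-memory buffer stands in for coolPickles.dat
buffer = io.BytesIO()
pickle.dump(shapes, buffer)  # one call, full data structure

buffer.seek(0)  # rewind before reading
rebuilt_shapes = pickle.load(buffer)

# The rebuilt list holds new objects with the same state
print(len(rebuilt_shapes))             # 3
print(rebuilt_shapes[0] is shapes[0])  # False

largest_shape = max(rebuilt_shapes, key=lambda shape: shape.area())
print(largest_shape.side)  # 5
```

Dumping the whole list in one call means a single pickle.load rebuilds everything. Only unpickle data from a trusted source, since loading a pickle can execute arbitrary code.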
Directory Structure
drwxrwxr-x htmlcov
-rw-rw-r-- inputShapes.txt
-rw-rw-r-- MANIFEST
-rw-rw-r-- runTests.sh
-rw-rw-r-- setup.py
drwxrwxr-x shapes
drwxrwxr-x tests
-rw-rw-r-- tox.ini
Threads & Processes
This section will focus on a single example.
Draft
- GIL
- Threads vs Processes
- Futures
- ProcessPoolExecutor
- Simple timer with datetime
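Until this section is written, here is a minimal sketch that touches each item in the list above. In standard CPython builds the GIL lets only one thread execute Python bytecode at a time, so CPU-bound work (such as simulating coin flips) is usually split across processes instead of threads. The count_heads function, the worker count, and the number of flips are all assumptions chosen for illustration.

```python
import random
from concurrent.futures import ProcessPoolExecutor
from datetime import datetime


def count_heads(num_flips: int, seed: int, p: float = 0.7) -> int:
    """Flip a trick coin num_flips times and count the heads."""
    rng = random.Random(seed)  # one generator per task, no shared state

    return sum(1 for _ in range(num_flips) if rng.random() < p)


def main():
    num_workers = 4
    flips_per_worker = 100_000

    start = datetime.now()

    with ProcessPoolExecutor(max_workers=num_workers) as executor:
        # submit() returns a Future immediately; result() blocks until done
        futures = [executor.submit(count_heads, flips_per_worker, seed)
                   for seed in range(num_workers)]

        total_heads = sum(future.result() for future in futures)

    elapsed = datetime.now() - start  # a timedelta

    total_flips = num_workers * flips_per_worker
    print(f"{total_heads} heads in {total_flips} flips")
    print(f"Elapsed: {elapsed.total_seconds():.3f} s")


if __name__ == "__main__":
    main()
```

The if __name__ guard matters here: on platforms that start worker processes by importing the main module, leaving it out would cause each worker to try to start its own pool.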