Introduction to Python and Jupyter¶

In this module we will be using the Python programming language. This language first appeared in 1991 and was designed by Guido van Rossum.

Python is currently one of the most popular programming languages, and is very widely used in scientific applications (especially those involving data analysis) and in Artificial Intelligence. It is often considered one of the best languages to learn first, since it supports a wide variety of programming concepts in a fairly straightforward way. It is therefore a good way to acquire a general understanding of programming that is transferable to many other languages.

What is a Computer Program?¶

Since this is an advanced computing course, most of you will have a fair idea of what a computer program is. However, let us briefly consider a general definition and a simple example, before moving on to more specific details.

A computer program consists of a collection of commands, usually stored in a file (or set of files). Rather than typing out the commands each time we want to execute them, we simply run the program. Operating systems provide various ways of running programs, such as entering a command in a terminal window or clicking on a program icon.

The following is an example of a very simple Python program, which, when run, will prompt the user to enter three numbers corresponding to the lengths of the three dimensions of a box, and will then calculate and output the volume of the box:

print("Welcome to the amazing box volume calculator!")

# Read in box x, y and z values (input returns each one as a string)
x = input("Enter the length (in cm) of your box: ")
y = input("Enter the width (in cm) of your box: ")
z = input("Enter the depth (in cm) of your box: ")

# The int function converts a string to an integer.
# Multiply the lengths in each dimension to get the volume.
volume = int(x) * int(y) * int(z)

print("The volume of your box is", volume, "cubic centimeters")

This program is imperative in style, which means that it consists of a sequence of commands that are to be executed when the program is run. The code illustrates some of the most fundamental ideas in programming. It takes input data (via the input commands), performs some calculation (to find the volume) and outputs (using the print command) a computed value.

The code also illustrates the use of variables (x, y, z and volume), which are assigned values using the = symbol. The value of volume is calculated by multiplying (using the * operator) the three measurement values. Notice that the values of the variables x, y and z will be sequences of characters (called strings) typed in by the user. These character sequences need to be converted to the corresponding numbers (integers) by the int function before they can be multiplied.
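To see why the conversion matters, here is a minimal sketch in which fixed strings stand in for what the user would type (the values "3", "4" and "5" are made up for illustration):

```python
# Fixed strings stand in for the values that input() would return.
x = "3"
y = "4"
z = "5"

# Without conversion, + joins the two strings instead of adding numbers.
joined = x + y  # "34"

# Multiplying one string by another is not defined, so Python raises a TypeError.
try:
    x * y
    strings_multiplied = True
except TypeError:
    strings_multiplied = False

# After conversion with int, the arithmetic behaves as expected.
volume = int(x) * int(y) * int(z)  # 60

print(joined, strings_multiplied, volume)
```

Note that int will itself raise an error if the string does not look like a whole number (for example "2.5" or "ten"), so a more robust program would need to handle that case.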

Question¶

Our general definition characterised a program as a "collection of commands". Do all programs consist of commands? What about definitions, questions, hypotheses? Can these be parts of a program? What about data? Is that part of a program?

What kind of programming language is Python?¶

It is difficult to neatly describe Python as a particular kind of programming language. It is not based on any one dominant idea of how to program, and incorporates a wide variety of features that are also found in many other modern programming languages, such as object orientation, functional programming, exception handling and metaprogramming.
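As a rough illustration of how some of these features look in practice, the following sketch touches on three of them. The names Box and parse_length are invented for this example and are not part of Python itself.

```python
# Object orientation: a class bundles data with the operations on that data.
class Box:
    def __init__(self, length, width, depth):
        self.length = length
        self.width = width
        self.depth = depth

    def volume(self):
        return self.length * self.width * self.depth


# Functional programming: functions are values that can be passed to other functions.
boxes = [Box(1, 2, 3), Box(2, 2, 2)]
volumes = list(map(lambda box: box.volume(), boxes))  # [6, 8]


# Exception handling: recover from an error instead of stopping the program.
def parse_length(text):
    try:
        return int(text)
    except ValueError:
        return None  # the text was not a whole number


print(volumes, parse_length("12"), parse_length("twelve"))
```

None of these styles is compulsory; a Python program can mix them freely or use none of them.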

We can identify a number of strengths and weaknesses of Python in relation to other programming languages:

Strengths¶

Easy to learn the basics
Very widely used
A huge number of 'packages' are available to support specific types of programming
Smooth progression from simple to complex programming
Provides convenient constructs for many high-level programming ideas
A good balance between theoretical elegance and practical utility

Thus, Python's strengths are centered around its ease of use and flexibility.

Weaknesses¶

May be too slow or inefficient for certain kinds of application.
Some people don't like the use of indentation to specify program structure.
Does not place restrictions on the type of value that a variable refers to (e.g. we could have phone_number = 1132431751 and phone_number = "0113 2431751" in the same piece of code, and Python would not even warn us that the same variable was used in one place to reference a number and in the other to reference a string of characters).
There are a huge number of different ways to do even very simple programming tasks, which can be a bit disorientating.
Does not enforce modularity (though it does support it). This can potentially lead to maintainability problems, especially in large systems.
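The point about variable types can be seen in a short sketch. The integer is written without a leading zero because Python 3 rejects decimal integer literals that start with 0. The function dial is a hypothetical helper, included only to show one way a programmer might check a type explicitly.

```python
phone_number = 1132431751  # the name refers to an integer
first_type = type(phone_number).__name__

phone_number = "0113 2431751"  # the same name now refers to a string; no warning
second_type = type(phone_number).__name__


def dial(number: str) -> str:
    # The annotation above is not enforced when the program runs,
    # so this hypothetical helper checks the type itself.
    if not isinstance(number, str):
        raise TypeError("number must be a string")
    return number.replace(" ", "")


print(first_type, second_type, dial(phone_number))
```

Explicit checks like this are one form of the careful programming that can reduce the risk of type confusion.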

Most of these issues can be mitigated by careful and systematic programming. Slowness is likely to be the most difficult to overcome.

Slowness can certainly be an issue for certain kinds of program requirements. For instance, if you wanted to program a system that could process mp4 videos and stream them to a display to produce a real-time video, you would probably not use Python for the processing and streaming. In such cases, greater speed can be achieved by lower-level compiled languages, whose code gets converted (by compilation) into hardware-specific, chip-level instructions that can be executed extremely efficiently.

Question¶

What language might one use to program a video streaming system?

Introduction to Jupyter Notebook¶

What is the notebook style of programming?¶

A notebook interface or computational notebook is a type of programming tool that enables a style of coding known as literate programming.

The idea of this kind of programming is that the purpose, functionality and logic of a program are explained by means of natural language descriptions that are interleaved with the actual source code of the program. In other words, the executable code of a program is interleaved with documentation of what it is for and how it works.

To support this kind of programming, we need a notebook file format and a programming environment that provides the means for displaying, editing, executing and viewing the output of notebook files. Typically, such a system would be built on top of an existing programming language by extending its file format to incorporate additional documenting information, and by providing a development tool that allows editing and display of this additional information as well as editing and execution of program code.

Jupyter Notebook¶

In the case of Jupyter, the notebook is built on top of the Python language. The file format used by Jupyter is IPython Notebook. Files in this format normally have file names ending in the extension .ipynb and will be referred to as ipynb files.

The content of ipynb files is primarily organised into two types of cell:

Python code cells contain code that is just like Python code that would be written in a non-notebook programming environment, although some code may contain commands that are specifically designed for use in notebooks (such as for displaying output in particular ways within the notebook).
Markdown cells contain textual information whose layout can be specified by the markdown notation, which enables formatting such as headers and bullet-point lists to be specified. These cells also allow the use of HTML formatting and allow hyperlinks and images to be inserted.

ipynb files also contain some other information regarding properties of the file and outputs that have been generated by the code cells, but this is created automatically, not edited by the programmer.
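To give a feel for what is inside such a file, the following sketch assumes version 4 of the notebook format, in which an ipynb file is a JSON document. It builds a tiny hand-written notebook as a Python dictionary (the cell contents are made up), converts it to JSON text, and reads the cell types back using only the standard json module.

```python
import json

# A minimal, hand-written notebook in the version 4 JSON layout.
# The two cells and their contents are illustrative only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Box volume"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # not yet run
            "outputs": [],            # filled in automatically when the cell is run
            "source": ["print(2 * 3 * 4)"],
        },
    ],
}

# Convert to the text that would be stored in an .ipynb file, then parse it back.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

In practice you would rarely write this structure by hand, since the notebook environment creates and updates it for you.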

There are actually several varieties of Jupyter notebook programming environment, but they are all very similar in both appearance and functionality. These environments all have the following general properties and capabilities:

The interface is web-based. This means that the user normally interacts with the Jupyter programming environment via a web browser (such as Chrome, Firefox, or Edge).
Code execution and file storage are handled via a server, which is a separate program running either locally on the user's machine (e.g. the Jupyter notebook server that comes with an Anaconda Python installation) or on a remote machine (e.g. provided by a cloud-based service, such as Google's Colab).
Functionality is provided for creating, editing and executing code and markdown cells.
Code cells can be executed individually, one at a time in sequence, or all together by running the whole notebook.
Code output is displayed below each code cell after it is run, and can be enhanced by various formatting and graphical tools.
When markdown cells are executed, they are rendered into the format specified by markdown notations within the cell.
A wide variety of other commands (such as file loading/saving, cutting, pasting, searching, clearing output and restarting) is provided by means of menus and hot-keys.

Advantages of the Notebook Style of Programming¶

The main advantages of using a notebook-style programming environment include the following:

Notebooks provide a convenient way of planning and documenting code.
Because documentation is within the same file as the code, it is less likely to be forgotten or ignored.
The division of code into cells is well suited to a pipeline style of programming, where computational processing and/or data analysis proceeds in a series of steps, and especially where one may want to take stock by conducting analysis and testing between stages of this sequence.
Because program output can also be nicely displayed, the notebook can also function as a report that presents and explains the results that have been obtained from the code.
The ability to include nicely formatted explanations, hyperlinks and images along with the code makes notebooks an excellent medium for sharing information, and hence for teaching and learning.
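The pipeline style mentioned above might look something like the following sketch, in which each commented block plays the role of one notebook cell. The data and variable names are hypothetical.

```python
# Cell 1: load the raw measurements (here just a literal list of strings).
raw = ["12", "7", "", "30", "n/a"]

# Cell 2: clean the data, keeping only entries that consist entirely of digits.
cleaned = [int(item) for item in raw if item.isdigit()]

# Between stages: take stock by checking how many entries were dropped.
dropped = len(raw) - len(cleaned)

# Cell 3: summarise the cleaned data.
mean = sum(cleaned) / len(cleaned)

print(cleaned, dropped, mean)
```

In a notebook, the results of each stage could be inspected before moving on, and a stage could be rerun on its own after a change.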

Disadvantages of Notebook Programming¶

Not all programmers are convinced by the charms of notebook programming. There are some disadvantages, including the following:

They only provide a limited subset of the system development functionality that is available in advanced non-notebook integrated development environments (such as Eclipse).
They do not provide good support for developing large systems that involve many complex components. Although it is normal to import many Python modules into a notebook, it is not so usual or convenient to write a system in the form of many notebook files.
They do not promote a good style of structured programming. This is because the linear execution of small program snippets within a notebook does not motivate attention to code structure and modularity. But modularity is vital to ensuring reliability of complex software systems that contain many inter-related components.

It is likely that in the future, some of these disadvantages will be mitigated by combining functionalities of notebook programming with those of the more traditional software development environments.