In this lesson we explore some of the most fundamental aspects of programming and see how to write simple pieces of Python code that manipulate data items by applying a variety of computational operators. Since Python is a large and complex language, we cover only high-level concepts, core programming constructs and a selection of significant features. Students should refer to other sources to broaden their knowledge of the wide range of data types and operations provided by Python.
Be able to identify and describe data types and their role in programming.
Be able to write code using Python's basic data types and operations on these types.
Before considering particular data types, a few other preliminary concepts need to be understood. We first introduce the idea of a variable and the assignment operator, which are central to the way data is referred to in most programming languages. We will need to use these in order to provide code examples of how data is manipulated by a Python program.
The concept of a variable is key to understanding most programming languages. Variables are words in a program’s code that are used to stand in place of some value. They may be thought of as referring to or naming their value. A variable can consist of any sequence of letters and digits and can also contain the underscore symbol _. However, the first symbol of a variable cannot be a digit.
Words used as variables are usually chosen by the programmer so as to give a mnemonic clue about the type of value referred to (e.g. a variable referring to a number might be num).
The value of a variable could be a number or a string (i.e. a sequence of characters) or some more complex data item (maybe a list or an image). Unlike names in ordinary language, variables can change their value. This happens when certain commands are executed.
Variables are given values using the = symbol. Giving a variable a value is usually called assignment.
Here is a simple example:
number_A = 4
number_B = 7
# set the variable, total, to the sum of the values stored in these two variables
total = number_A + number_B
# print the value of total
print( total )
It is important to realise that the variable names (in this case number_A, number_B and total) have no significance to the execution of the program apart from being names associated with some value. Hence, we would get exactly the same effect by running the following code:
cachucath = 4
cachucwn = 7
# set the variable, omemiserum, to the sum of the values stored in these two variables
omemiserum = cachucath + cachucwn
# print the value of omemiserum
print( omemiserum )
Although the effect of running the code of either Example 1 or Example 2 is the same, most programmers would find Example 2 far less appealing. Why is this?
Although = is used to symbolise assignment in many programming languages, some consider this to be inappropriate, because the = symbol is traditionally used to signify equality, but assignment is very different from equality (both in its ordinary meaning and in mathematics). In some programming languages (e.g. Pascal) assignment is written using the symbols :=, with = being used for the equality relation.
To learn more about the concept of assignment and its notation in various programming languages, you could take a look at the Wikipedia article on assignment.
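A small sketch may help to show why assignment is not the same thing as equality. The variable name counter is purely illustrative. Read as a mathematical equation, counter = counter + 1 could never hold; read as an assignment, it simply computes a new value from the old one and stores it under the same name.

```python
# Assignment gives a name a value, and can be repeated to change that value.
counter = 10
print(counter)            # the value is 10

# The right-hand side is evaluated first (10 + 1), then stored in counter.
counter = counter + 1
print(counter)            # the value is now 11
```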
Programmers who are only familiar with imperative programming languages, such as Python (and indeed nearly all the most widely used languages), often assume that the use of variables and assignment must be an essential feature of any programming language. But is assignment essential? Are there any languages that do not have an assignment operation?
In order to manipulate data, a programming language provides a variety of operations that can be applied to data items to produce new data items. You will need to understand the terminology used to describe such operations and be able to recognise the different ways in which they occur within code.
The terms operator, method and function all describe kinds of manipulation that can be applied to data. Although conceptually similar, they have different connotations and correspond to different forms of syntax in Python:
operator : this term typically refers to the most basic types of manipulation, such as mathematical operations. These are often represented by specific symbols, such as +, *, =, ...
function : this term generally refers to an operation that is specified using the syntax func(arg1, ..., argN), where func is a word describing the function. In some cases there can be no arguments (for example print(): if given no arguments, the print function will just output a newline character), and, as we shall see later, Python provides some more elaborate ways that we can specify the arguments. Most languages, including Python, not only provide a large number of built-in functions but also enable the programmer to define new functions.
method : the term method is used in object-oriented programming languages (including Python) to describe a kind of function that is intimately connected with some specific type of data that has been defined as a class. The syntax for a method application takes the form data_item.method(arg1, ..., argN). Here, data_item will be an item of a certain class of data and method should be a method that has been defined for that class.
You should be aware that the words 'operator' and 'function' are not always used with the precise meanings just given. They are sometimes used in a general sense that includes all three types of manipulation. The term procedure is also often used as more-or-less equivalent to function. (Historically, in programming the word 'function' was typically used for an operation that returns a result, whereas 'procedure' meant an operation that produced some effect without returning a value. This distinction is not usually made in Python. In fact, in Python every function/procedure returns some value. If it ends without a return statement, the value None will be returned.)
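The three forms of syntax described above can be compared side by side in a short sketch. The function greet is a hypothetical example written only for this illustration; it has no return statement, so calling it gives back None.

```python
# Operator syntax: a symbol placed between its operands.
total = 3 + 4

# Function syntax: func(arg1, ..., argN)
length = len("snake")

# Method syntax: data_item.method(arg1, ..., argN)
shout = "snake".upper()

# A function defined without a return statement still returns a value: None.
def greet(name):
    print("Hello,", name)

result = greet("Pythia")   # prints a greeting, then returns None

print(total, length, shout, result)
```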
You will not need to fully understand the way classes and methods can be defined until later in the module. However, some very common and useful built-in data operations are provided in the form of methods by Python, so you need to recognise the syntax and understand the effect of certain method calls. You will see some examples soon, when we consider the string data type.
Corresponding to the informal idea that data comes in various different types, nearly all programming languages employ a technical notion of a data type. In this technical sense each data type is associated with specific ways of representing and storing data items of that type, and also with particular operations that can be applied to data of that type.
The most fundamental types of data entity used by Python (sometimes called primitive types) are the following:
int, float and complex (the types used for representing numbers)
str (the type of strings, i.e. sequences of characters)
bool (the type of the Boolean values True and False)
NoneType (the type of the special value None, used to indicate the absence of a normal value)
Python also provides several kinds of structured data type that enable one to represent data consisting of combinations of data components, including some very general and very useful types.
We shall also see later that the keyword class provides a means of defining arbitrarily complex data types (i.e. classes) incorporating multiple data components into one object.
The words "class" and "type" (or data type) mean almost the same thing. Typically "type" is more general, whereas "class" was traditionally used to describe only complex data types defined within an object-oriented programming language. But in Python3 there is no real distinction, and even primitive data types can also be referred to as classes. However, as we shall see below, within the actual syntax of Python, the words class and type do have different meanings.
The keyword class is used to define a new class/type of data object, whereas the built-in function type( ... ) is used to find the class/type of its argument. We shall look at these later in the lesson.
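As a brief preview, the built-in function type can be applied to a few literal values to see which class each belongs to. This is only a sketch; the variable name is_int is arbitrary.

```python
# type( ... ) reports the class/type of its argument.
print(type(42))           # <class 'int'>
print(type(4.2))          # <class 'float'>
print(type("forty-two"))  # <class 'str'>
print(type(True))         # <class 'bool'>
print(type(None))         # <class 'NoneType'>

# The result can be compared with a type name.
is_int = type(42) is int
print(is_int)             # True
```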
Python provides three different basic types for representing numbers.
int : an int corresponds to an integer-valued number of arbitrary length, e.g. -17, -1, 0, 2, 5, 10, 42, 15647989
float : a float can hold a floating point number, allowing the approximate representation of real numbers.
complex : Python can handle complex numbers, which consist of a real and an imaginary part.
In this course we will not be concerned with complex numbers, which are mainly used in physics, engineering and certain branches of mathematics.
However, we will certainly be using int and float type numbers, so it is important to understand the difference between them, and the various operators that can be applied to them.
int values (integers) are basically whole numbers, including negative numbers and 0. Many kinds of data come in the form of integers, for example the numbers of packets of each type of biscuit on a supermarket shelf, or USB ports on a service robot's back end. Integer values are also commonly used to control computations, which very often involve some kind of counting or numerical conditions.
For most purposes you can think of a float as a decimal number. This is how it will usually appear in a program and how it will usually be represented in a data file. But the actual way floating point numbers are stored in a running Python program is much more complicated, and not covered in this module. In fact, in most kinds of programming, it is rare to have direct references to specific floating point numbers in the code. However, many kinds of data consist of decimal numbers, so we shall often be working with programs that read in such information and manipulate it in the form of float type values.
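Although the storage details are outside the scope of this module, one practical consequence of a float being only an approximation is worth seeing once. The sketch below uses math.isclose from the standard math module; the variable names are illustrative.

```python
import math

a = 0.1 + 0.2
print(a)                         # 0.30000000000000004

# An exact equality test can fail because of the tiny representation error.
exact_match = (a == 0.3)
print(exact_match)               # False

# Comparing within a small tolerance is usually what is wanted.
close_enough = math.isclose(a, 0.3)
print(close_enough)              # True
```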
int and float data
As one would expect, the representation of a particular int in Python is just a sequence of decimal digits, such as 31726. And the representation of a float is the same but with a decimal point somewhere in the number, such as 317.26.
It is important to note that while a sequence of digits, say 123, represents an int, the sequence 123.0 represents a float. Nevertheless, Python's equality operator == is defined such that 123 == 123.0 does return True.
The following table summarises the most common mathematical operators used in Python. To denote such fundamental operations, Python (like most other programming languages) uses symbols that are similar to those employed in paper-and-pencil mathematics:
Operator | Operation |
---|---|
x + y | Addition of x and y |
x - y | Subtraction of y from x |
x * y | Multiplication of x and y |
x / y | True division of x by y. The result is always a floating point number |
x // y | Floor division of x by y (explained below) |
x % y | Modulo. The result of this operation is the remainder when x is divided by y |
-x | The negative of x |
x**y | x raised to the power of y |
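To make the table concrete, here is each operator applied to the same pair of numbers. The values 7 and 2 are chosen arbitrarily for illustration.

```python
x = 7
y = 2

print(x + y)     # 9
print(x - y)     # 5
print(x * y)     # 14
print(x / y)     # 3.5
print(x // y)    # 3
print(x % y)     # 1
print(-x)        # -7
print(x ** y)    # 49
```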
Mathematical operations are very often used to compute new values that are then assigned to a variable, which will store the value that has been computed. For example:
edge = 57
cube_volume = edge**3
As one would expect, brackets are often required to determine the order in which operations should be carried out:
(2 * 3) + 4
does not give the same value as
2 * (3 + 4)
You should all be very familiar with the operators of basic mathematics, so we shall just consider examples using a couple of the less familiar operators.
In the mathematics of integers, the value of n modulo m is the smallest non-negative integer x such that n - x is an integer multiple of m. In other words m * k = n - x for some integer k. Or, more loosely but intuitively, we can say that n modulo m is the remainder obtained when n is divided by m.
The modulo operator is well known in mathematics, but is mainly studied in relation to the quite abstract theory of numbers. However, it has many computational applications, and in fact is useful in many data conversion operations.
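The definition of modulo given above can be checked directly in Python. The following is a minimal sketch with arbitrarily chosen numbers; the clock example at the end is an added illustration rather than something taken from the lesson.

```python
n = 17
m = 5

x = n % m
print(x)                      # 2

# n - x should be an integer multiple of m, so dividing it by m leaves no remainder.
print((n - x) % m == 0)       # True

# With a positive m the result is never negative, even when n is negative.
print(-7 % 3)                 # 2, since -7 - 2 = -9 = 3 * (-3)

# Wrapping around a 24-hour clock: 5 hours after 22:00 is 03:00.
hour_now = 22
hour_later = (hour_now + 5) % 24
print(hour_later)             # 3
```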
%
Suppose we have a number t representing the length of a time period in seconds, but want to display it in the form of m minutes and s seconds. Then the number of seconds s, that are not incorporated within the minutes, will be t modulo 60, since there are 60 seconds in each minute.
t = 357
s = t % 60
m = (t - s) / 60
print( m, "minutes, and", s, "seconds")
int vs float types
If you look carefully at the output from the code just above, you will notice a difference between the way that the minutes and seconds are displayed. The number of seconds is simply expressed as 57, an integer. But in the case of the minutes, we see 5.0; so the number 5 is expressed as if it were a decimal, even though it is a whole number. Why is this?
The answer is simply that in Python3 the / operation always returns a float type number. This may seem odd in the case where we divide an int by another int which divides exactly into it. However, it does mean that any expression of the form x / y will return a float, and this consistency is generally thought to be preferable to having an expression that could sometimes return an int and at other times a float, which could potentially give rise to hard-to-spot bugs.
Of course, Python provides an easy way to convert between int and float numbers. We use the type names in the form of an operator, which transforms its argument into the specified type:
int( x ) will give the int corresponding to x, which can be a float. If the value of x is a positive float then int(x) is the largest int that is less than or equal to that float. Thus int( 5.7 ) has the value 5.
float( x ) will give the float corresponding to x, which can be an int. Thus float( 5 ) will give the value 5.0.
These kinds of operation are often called casting. One may say that an int is cast into a float (or vice versa). We shall later see how str can be used as a casting operator to create strings from numbers.
Some languages will automatically cast data items from one type to another depending on context. In Python we have to explicitly use a function to get a value of a different type. Having said that, many functions are defined so that they can operate on different types of data (especially ints and floats), which has a similar effect to automatic casting.
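The casting operations and the mixed-type behaviour just described might be tried out as follows. This is only a sketch with illustrative variable names; it deliberately uses positive numbers only.

```python
# Explicit casting between int and float.
whole = int(5.7)
print(whole)            # 5

decimal = float(5)
print(decimal)          # 5.0

# The + operator accepts an int together with a float and gives a float,
# which has a similar effect to automatic casting.
mixed = 2 + 3.5
print(mixed)            # 5.5
print(type(mixed))      # <class 'float'>
```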
What is the value of int( -1.7 )?
int
In the previous example we saw how, because the result of the / operation is always a float, the number of minutes was displayed as a float (5.0) instead of an int. We can easily rectify this by using the int casting operator.
t = 357
s = t % 60
m = int( (t - s) / 60 )
print( m, "minutes, and", s, "seconds")
We are using Python3, but in Python2 the behaviour of / when applied to int numbers is very different. In Python2, in the case where x and y are ints, the expression x / y always returns an int, even when y does not divide exactly into x. In such cases the result will always be rounded downwards to the highest integer value that is less than or equal to the actual value of x divided by y.
Thus, in Python2, 3/4 will return the value 0 and, perhaps even more surprisingly, -3/4 will return -1 (since the result is always rounded downwards). To get a float result for a division in Python2, at least one of the two numbers operated on must be a float. Thus, 3.0/4 and 3/4.0 both return 0.75.
One could argue that it is reasonable that an operation on ints should return an int. However, it is not clear why we should expect this. Moreover, the Python2 behaviour of / did cause a lot of confusion for beginner Python programmers and gave rise to many bugs. Hence, when Python3 was designed, it was decided that x / y should always return a float. It is noteworthy that small but fundamental details can have great significance in the design of a programming language and can give rise to a great deal of analysis, discussion and contention. The / in Python is a very good example of this, as you will see if you read Python Enhancement Proposal PEP 238, which gives very detailed motivation and specification of the change made to the meaning of / when going from Python2 to Python3.
The so-called "floor division" operator x // y gives the largest whole number that is equal to or less than the real number x / y. Thus, for example, 7 // 3 gives 2. Nevertheless, the value returned is not always of type int. In fact, it will be an int only if both numbers it operates on are ints. If either or both is a float the result will be a float. Thus 7 // 3.0 and 6.0 // 3 both give 2.0.
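The type rule for floor division can be confirmed with a few quick checks. The sketch below also illustrates how // and % fit together: multiplying the floor quotient by the divisor and adding the remainder gives back the original number. The variable names are illustrative.

```python
a = 7 // 3
b = 7 // 3.0
c = 6.0 // 3
print(a, type(a))      # 2 <class 'int'>
print(b, type(b))      # 2.0 <class 'float'>
print(c, type(c))      # 2.0 <class 'float'>

# Floor division and modulo are two halves of the same division.
x = 17
y = 5
recombined = (x // y) * y + (x % y)
print(recombined == x)   # True
```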
Using the // operator, the conversion from a duration in seconds to one in minutes and seconds can be given in a more succinct and clear form. Because x // y gives an int if both x and y are ints, there is no need to transform a float to an int:
t = 357
s = t % 60
m = t // 60
print( m, "minutes, and", s, "seconds")
Given that int(-1.5) is -1, one might expect that -3 // 2 would also have that value. However, the value of -3 // 2 is actually -2. Why is this?
To find out more about this you could take a look at the blog article
Why Python's integer division floors by Python's creator Guido van Rossum.
Python contains a range of comparison operators that can be applied to numbers, and have symbolic representations and meaning similar to those used in mathematics:
Comparison Relation | Meaning |
---|---|
x == y | x is equal to y |
x != y | x is not equal to y |
x > y | x is greater than y |
x < y | x is less than y |
x >= y | x is greater than or equal to y |
x <= y | x is less than or equal to y |
All comparison relations return a Boolean, i.e. one of the values True or False, which are of type bool. The characteristics and uses of bools will be explained below.
As well as comparing numbers, all of these operations can also be applied to strings and various other data types. In the case of strings, the ordering by which the comparison is determined is essentially the alphabetical ordering of the strings (i.e. the order they would appear in a dictionary), although upper-case letters are ordered before lower-case ones. Furthermore, when defining a new type of data as a class, a programmer may specify conditions for equality and ordering of instances of that class.
In most cases, comparison relations only make sense when the two values being compared are of the same type. However, as noted above, we can compare int values to float values. In this case, the int is considered equal to the corresponding float.
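A few comparisons, on numbers and on strings, are sketched below. The last line shows a detail worth knowing: Python orders upper-case letters before lower-case ones, so string comparison is not always identical to dictionary order. The variable names are illustrative.

```python
print(3 < 5)                 # True
print(3 >= 5)                # False
print(3 != 5)                # True

# An int compares equal to the corresponding float.
same_number = (2 == 2.0)
print(same_number)           # True

# Strings are compared character by character.
in_order = ("apple" < "banana")
print(in_order)              # True

# "B" is ordered before "a", so this comparison is False.
mixed_case = ("apple" < "Banana")
print(mixed_case)            # False
```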
It is essential to be aware that the equality relation is symbolised by == rather than =, since = is the symbol used for assigning a value to a variable. Writing x = y where one means x == y is a very common error in Python coding. Luckily, such errors can normally be detected immediately by the Python interpreter, because x = y cannot occur in a place where a bool value is required (e.g. in the condition of an if statement). So such a mistake will generate an error when Python reads in the code, even before it is actually executed.
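One way to see that the mistake is caught before anything runs is to ask Python to read a piece of faulty code without executing it. The sketch below uses the built-in compile function for this; it relies on try and except, which are covered later in the course, and the variable names are illustrative.

```python
# A fragment of code containing the common mistake: = where == was intended.
faulty_source = "if x = y:\n    print('same')\n"

try:
    # compile only reads and checks the code; it does not execute it.
    compile(faulty_source, "<example>", "exec")
    detected = False
except SyntaxError as err:
    detected = True
    print("Rejected before running:", err.msg)

print(detected)   # True
```

Note that x and y are never given values here, which confirms that the error is reported when the code is read rather than when it is run.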
math module
Python provides a large range of built-in functions, which could potentially enable us to code almost any application. However, for many purposes it is convenient to be able to import additional functions or classes that have already been implemented by other programmers. (import allows us to stand on the shoulders of earlier nerds.) This is particularly so for mathematical functions, which are difficult to implement efficiently.
In Python the term module technically refers to any code that is stored in a single file. However, the term is particularly used when talking about the standard modules that are provided with Python or distributed within some package. (A package is just a collection of modules that are grouped and distributed together.)
In later units of this course, we shall be using several popular and very useful Python modules (such as pandas and matplotlib), and we shall look in more detail at the import command. However, for the time being the following example gives a good illustration of the use of the math module and the kinds of function it includes.
import math
print( math.pi )
print( math.cos( math.pi ))
print( math.log2(1024) )
print( math.sqrt(2) )
int and float Methods
As mentioned above, in Python there is no specific difference between the notions of data type and class. Even the simplest types of data can be regarded as classes and provide methods that can be applied to data items of that type.
In practice, for most kinds of programming the basic operators and the math module are sufficient, and it is rare to operate on numbers by means of methods. However, there are some cases where we might do this. For example, ints have a method bit_length(), which will return the number of bits needed to represent a given integer in binary. This can be applied to a variable whose value is an int, using the method syntax, as follows:
x = 123456
x.bit_length()
In the case of float, we might want to know whether a float value is equal to an integer value, which we can find using the Boolean valued is_integer method. For example, 5.0.is_integer() will return True.
We cannot apply methods directly to the decimal representation of an int. Code such as 123456.bit_length() will give an error. This is because the . following the digits is read as a decimal point, so the interpreter does not expect it to be followed by a method name. However, this issue is easily solved by use of brackets: we can write (123456).bit_length(), which Python is happy to evaluate.
5.0.is_integer()
(123456).bit_length()
Like nearly all programming languages, Python provides a type of value known as Boolean (named after the logician George Boole), which has only two possible values:
True
False
These are the only possible values of the type bool.
As we saw above, Python provides a range of comparison operators, which return Boolean values when evaluated. For example:
3 > 2
2 + 2 == 4
input_string == "yes"
"Hello"[0] == "H"
These values can be used directly, or assigned to variables, but more often they occur in the context of test operations, which are used to specify conditionals (control constructs using if) and loops (control constructs using while or for).
Here is a very simple example of code using a test operation within a conditional if construct:
if temp > 27 :
    print( "I'm too hot!" )
Conditionals and loops are central to the control logic of programs, which requires choices that determine the sequence in which program code is executed based on whether certain conditions hold. Such choices are essential for the implementation of all but the most trivial algorithms. We will soon be studying control logic in detail, so the current lesson contains only a summary of the operations that can be carried out on Booleans.
Boolean values can be manipulated by means of the Boolean operators: and, or and not:
Expression | True whenever |
---|---|
not test | test is False |
test1 and test2 | both of test1 and test2 are True |
test1 or test2 | either test1 or test2 is True (or both) |
test1 == test2 | test1 and test2 have the same value (both True or both False) |
Since there are only two Boolean values, in all cases where an operator does not give the value True it gives the value False.
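Because there are only four possible combinations of two Boolean values, the whole behaviour of these operators can be listed exhaustively. The sketch below builds such a truth table; it uses for loops and a list, which are introduced properly later, and the name rows is illustrative.

```python
rows = []
for test1 in (True, False):
    for test2 in (True, False):
        # One row per combination: the two inputs, then the result of each operation.
        row = (test1, test2, not test1, test1 and test2,
               test1 or test2, test1 == test2)
        rows.append(row)

print("test1 test2 not-test1 and or ==")
for row in rows:
    print(row)
```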
We can, of course, combine several Boolean operations together, which enables a complex condition to be defined in terms of combinations of simpler conditions.
Assuming that the variable wearing_coat has been assigned a Boolean value, we might write the following code:
if temp > 27 or (temp > 20 and wearing_coat) :
    print( "I'm too hot!" )
The rules of eligibility for Python programmers' snake wrestling competition are as follows:
Pythia, the competition organiser, has started coding a small Python program in order to help her determine who is eligible to contend. She has created some variables for the relevant attributes and assigned them values describing her friend Charmer. But she has got rather stuck coding the complex Boolean condition corresponding to the eligibility requirements of the snake wrestle. Can you help Pythia?
Modify the following code to capture the specified requirements:
name = "Charmer"
age = 30
arms = 1
wears_wig = True
married = False
eligible = ( age >= 5 and age <= 100
             ### Add the rest of the conditions here
             ### By using only variables, comparisons
             ### and Boolean operators.
           )
if eligible:
    print( name, "is eligible :)")
else:
    print( name, "is not eligible :(")
Strings will be considered in detail in the next Unit. In the current lesson we shall just summarise their basic characteristics and illustrate some typical operations one might carry out on strings.
Strings are sequences of character symbols. When stored or processed by a computer, each character symbol will be represented as a sequence of one or more bytes, in accordance with some character encoding (most commonly ASCII or Unicode). But for present purposes, we can simply consider character symbols to be the atomic elements of strings. Many programming languages have distinct data types for characters and strings. However, in Python there is no separate data type for characters; a character is just a string of length 1. This seems to work well and avoids the need for type casting between individual characters and strings.
Within a Python program we are actually provided with several ways to represent a string. A detailed account of all the syntactic notations that may be used to represent strings would be very long, and there is a great deal of documentation available online that you can refer to. So rather than reading the details here, it will be more useful for you to consider the following examples of how different notations can be used to create strings of different kinds. Although the examples are intended to explain these notations, you should refer to Python documentation or other resources to gain a more complete understanding.
'This is a string.'
"So is this!"
"and" ' even ' "this"
"This string contains a quote symbol, but it's not the end of the string."
"Special characters like 'newlines' \n can be included by use of backslashes"
print( "When printed the \n will be converted to an actual new line" )
'''This is a long string, which goes on and on and on and on and on
and on and on and on and on and on, taking up several lines before it
finally ends.'''
"""
<H1> INFO ABOUT STRINGS IN PYTHON </H1>
<ul>
<li> The choice of two types of quotation character for demarcating
strings makes it easier to write strings that have quotation marks
within them,
like this: 'quotation within a string'
<li> Triple quoted strings provide a nice way of incorporating
chunks of formatted material into a program file.
<p>
For instance you might want to include a header or template
in an output file, such as an HTML file that you want your
program to create automatically.
<li> The use of 'escape sequences' such as '\n' and '\t' can cause
some confusion because they look like ordinary characters when
the string is represented in a program but have a special
meaning when the string is printed.
</ul>
"""
operation | code example | value returned |
---|---|---|
+ | 'cat' + 'fish' | 'catfish' |
* | 'Hey!' * 3 | 'Hey!Hey!Hey!' |
len | len( 'antidisestablishmentarianism' ) | 28 |
in | 'tuna' in 'fortunate' | True |
in | 'fun' in 'programming' | False |
Individual characters and ranges of characters (called sub-strings or slices) can be extracted using a fairly simple but flexible notation.
To refer to a character within a string we use an index number, which must be an int.
The syntax used is illustrated by the following examples:
operation | code example | value returned | notes |
---|---|---|---|
use of a string index | "Hello"[0] | 'H' | the index counts from 0 |
use of a string index | "Hello"[1] | 'e' | |
a negative index | "Hello"[-2] | 'l' | a negative index counts back from the end |
You can also extract a sub-string (slice) of a string by specifying a range of indices using the : character:
>>> "hello everyone"[6:11]
'every'
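A few more indexing and slicing expressions applied to the same string are sketched below. The last two go slightly beyond the text above: leaving out the number before or after the : takes the slice from the start or to the end of the string. The variable names are illustrative.

```python
text = "hello everyone"

first = text[0]
last = text[-1]
word = text[6:11]
print(first, last, word)     # h e every

# Omitting an index: from the start, or up to the end.
start = text[:5]
rest = text[6:]
print(start)                 # hello
print(rest)                  # everyone
```

Note that the slice [6:11] includes the character at index 6 but stops just before index 11.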
As was mentioned above, many useful operations on strings are provided in the form of methods. A method is essentially just a function that is attached to some particular class.
method | example code expression | value returned |
---|---|---|
startswith | 'fundamentals of programming'.startswith('fun') | True |
endswith | 'slaughter'.endswith('laughter') | True |
upper | 'Hello!'.upper() | 'HELLO!' |
isupper | 'Hello!'.isupper() | False |
title | 'this is a title'.title() | 'This Is A Title' |
join | '; '.join(['Veni', 'Vidi', 'Vici']) | 'Veni; Vidi; Vici' |
split | 'never-to-be-separated'.split('-') | ['never', 'to', 'be', 'separated'] |
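Since each of these methods returns a new value, the result of one can be fed into another. The following sketch, with illustrative variable names, splits a hyphenated phrase, rejoins it with spaces and then capitalises each word.

```python
phrase = 'never-to-be-separated'

words = phrase.split('-')
print(words)                  # ['never', 'to', 'be', 'separated']

spaced = ' '.join(words)
print(spaced)                 # never to be separated

heading = spaced.title()
print(heading)                # Never To Be Separated

print(heading.startswith('Never'))   # True
```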
We shall explore these and other string methods in detail in the next unit.
None
Though often overlooked when considering the basic types of Python, the value None (which is the only value of the type NoneType) is useful in a variety of circumstances. In particular:
None is most often encountered as the value that is automatically returned by a function if it reaches the end of its code without reaching a return statement. The keyword return is used by programmers when defining a function to specify what value should be returned by the function. We shall soon see this in action, when we explore how to define and use functions.
It can be used as an initial value for a variable which later may be given a specific value during execution of the program. Code may check the variable and perform different actions depending on whether it has the value None or a specific value.
It can be explicitly returned from a function by the return command. This could perhaps indicate that, for some reason, it was not possible to return a value of the type normally returned by that function.
The last reason given for using None, and possibly also the middle one, is controversial. Some would insist that to deal with such cases one should use other mechanisms provided by Python, in particular the exception handling system. We shall look at exception handling later in the course.
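The first two uses of None can be illustrated with a small sketch. The function log_message and the variable names are hypothetical, invented only for this example, which also uses a for loop and a list ahead of their formal introduction.

```python
# A function with no return statement returns None automatically.
def log_message(text):
    print("LOG:", text)

outcome = log_message("starting")
print(outcome)                    # None

# None as an initial value meaning "nothing recorded yet".
best_score = None
for score in [12, 30, 7]:
    if best_score is None or score > best_score:
        best_score = score

print(best_score)                 # 30
```

The test best_score is None checks specifically for the value None, which is the usual way of writing such a check in Python.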
In this lesson we have surveyed the basic data types provided by Python and illustrated their use with small pieces of code. We have seen that, even at the level of basic data items, there is considerable variety and complexity. We shall appreciate more fully, as the course progresses, that this rich framework has been designed primarily to support the programmer in the implementation of sophisticated algorithmic processes and data manipulations. The organisation of information in terms of appropriate types can greatly reduce the complexity of the actual algorithms required and is extremely helpful in enabling programs to be constructed in a systematic and reliable way.