Brackets in Python and pandas#
A common source of confusion for those who are new to Python and pandas is the use of different types of brackets. Hopefully this guide can clarify them for you.
Curly braces#
{ and }
Curly braces are used to define Python dictionaries (dicts).
my_dict = {
"A": 1,
"B": 2,
"C": 3,
}
type(my_dict)
dict
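As a side note that goes slightly beyond the example above, curly braces are also used for Python sets. The sketch below shows how to tell the two apart. The variable names are only for illustration.

```python
# Curly braces with key: value pairs make a dict.
my_dict = {"A": 1, "B": 2}

# Curly braces with bare values make a set; duplicates are dropped.
my_set = {"A", "B", "B"}

# Empty curly braces give an empty dict, not an empty set.
empty = {}

print(type(my_dict).__name__)
print(type(my_set).__name__, len(my_set))
print(type(empty).__name__)
```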
Square brackets#
[ and ]
Lists#
Python uses square brackets to define lists.
my_list = ["A", "B", "C"]
type(my_list)
list
Square brackets are also used to retrieve an element from a list, by passing in its index. Indexes start at 0, so index 1 is the second element.
my_list[1]
'B'
…or a slice:
my_list[1:]
['B', 'C']
Dictionaries#
When you have a Python dict, you retrieve values by passing the key between square brackets.
my_dict["B"]
2
DataFrames#
Let’s create a simple DataFrame we can use for demonstration purposes. To do so, we pass a dictionary with a list of values for each column to the pd.DataFrame constructor.
import pandas as pd
# create a simple DataFrame
df = pd.DataFrame(
{
"col1": ["A", "B", "C"],
"col2": [3, 4, 5],
"col3": [6.32, 8.1, 4.9],
}
)
df
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | A | 3 | 6.32 |
| 1 | B | 4 | 8.10 |
| 2 | C | 5 | 4.90 |
Single column#
To retrieve a single column, take the DataFrame variable followed by square brackets, passing in the name (label) of a column as a string.
df["col1"]
0 A
1 B
2 C
Name: col1, dtype: object
type(df["col1"])
pandas.core.series.Series
Multiple columns#
If you pass multiple column names as a list, it will return a new DataFrame with just those columns in that order.
columns = ["col3", "col2"]
type(columns)
list
df[columns]
| | col3 | col2 |
|---|---|---|
| 0 | 6.32 | 3 |
| 1 | 8.10 | 4 |
| 2 | 4.90 | 5 |
type(df[columns])
pandas.core.frame.DataFrame
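You’ll sometimes see that written all in one line. In this case, the inner and outer pairs of square brackets are serving different purposes: the inner pair creates a list of column names, and the outer pair selects those columns from the DataFrame.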
df[["col3", "col2"]]
| | col3 | col2 |
|---|---|---|
| 0 | 6.32 | 3 |
| 1 | 8.10 | 4 |
| 2 | 4.90 | 5 |
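To see the difference the extra pair of brackets makes, here is a small sketch that selects the same column both ways. It rebuilds the demonstration DataFrame so it can run on its own. The names as_series and as_frame are only for illustration.

```python
import pandas as pd

# Same demonstration DataFrame as above.
df = pd.DataFrame(
    {
        "col1": ["A", "B", "C"],
        "col2": [3, 4, 5],
        "col3": [6.32, 8.1, 4.9],
    }
)

# One pair of brackets around a string: a single column, returned as a Series.
as_series = df["col2"]

# The inner pair builds a one-item list, so the outer pair returns a DataFrame.
as_frame = df[["col2"]]

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)
```

So even when you only want one column, the number of bracket pairs determines whether you get a Series or a DataFrame back.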
Boolean indexing#
See Lecture 2. In boolean indexing, we pass a Series of True and False values between square brackets, which tells pandas which rows to select.
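As a minimal sketch of the idea, the example below builds a boolean Series by hand and passes it between square brackets. In practice the Series usually comes from a comparison, and the names mask and selected are only for illustration.

```python
import pandas as pd

# Same demonstration DataFrame as above.
df = pd.DataFrame(
    {
        "col1": ["A", "B", "C"],
        "col2": [3, 4, 5],
        "col3": [6.32, 8.1, 4.9],
    }
)

# One True/False value per row: keep rows 0 and 2, drop row 1.
mask = pd.Series([True, False, True])

# Square brackets with a boolean Series select the rows marked True.
selected = df[mask]
print(selected)
```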
Parentheses#
( and )
Parentheses serve a number of purposes in Python. Because pandas is a Python package, it uses them too.
Tuples#
Python has a type called a tuple, which is like a list that can’t be modified.
my_tuple = (1, 2, 3)
type(my_tuple)
tuple
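The sketch below illustrates what "can't be modified" means in practice, and adds one detail about the parentheses themselves: a one-item tuple needs a trailing comma. The variable names are only for illustration.

```python
my_list = [1, 2, 3]
my_tuple = (1, 2, 3)

# Lists can be changed in place.
my_list[0] = 99

# Tuples cannot: assigning to an element raises a TypeError.
try:
    my_tuple[0] = 99
except TypeError as error:
    print("TypeError:", error)

# A one-item tuple needs a trailing comma.
single = (1,)

# Without the comma, the parentheses only group the value.
not_a_tuple = (1)

print(type(single).__name__)
print(type(not_a_tuple).__name__)
```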
Logical grouping#
You can use parentheses to control the order of operations.
(1 + 1) / 6
0.3333333333333333
1 + (1 / 6)
1.1666666666666667
Multi-line statements#
You can wrap Python statements in parentheses to split them into multiple lines. This also allows you to embed comments.
60 * 24 * 365
525600
can be rewritten as:
minutes_in_year = (
# minutes per hour
60
# hours per day
* 24
# days per year
* 365
)
minutes_in_year
525600
Functions#
In Python, parentheses are used in function definitions to specify the arguments.
def add_five(num):
    return num + 5
Parentheses are also used to call functions, passing in the arguments (if any). Without them, the name refers to the function object itself rather than calling it.
add_five
<function __main__.add_five(num)>
add_five(6)
11
Classes#
When making a new instance of a class, you use parentheses after the class name. We saw this above with pd.DataFrame().
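The same pattern applies to classes you define yourself. Here is a minimal sketch, where Point is a hypothetical class made up for this example:

```python
# Point is a hypothetical class, defined here only for illustration.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Parentheses after the class name create a new instance,
# passing the arguments along to __init__.
origin = Point(0, 0)

# Without parentheses, the name refers to the class itself.
print(type(origin).__name__)
print(type(Point).__name__)
```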
Angle brackets#
< and >
Angle brackets are used for comparisons.
6 > 5
True
pandas#
In pandas, that comparison can be done across all values in a Series, returning a new Series with the results.
df["col3"]
0 6.32
1 8.10
2 4.90
Name: col3, dtype: float64
type(df["col3"])
pandas.core.series.Series
df["col3"] < 7
0 True
1 False
2 True
Name: col3, dtype: bool
type(df["col3"] < 7)
pandas.core.series.Series
You’ll often see this used in boolean indexing.
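Putting the pieces together, the sketch below uses a comparison inside square brackets to filter rows, then combines two comparisons. When combining conditions with & (and) or | (or), each comparison needs its own parentheses, because those operators bind more tightly than < and >. The names small and both are only for illustration.

```python
import pandas as pd

# Same demonstration DataFrame as above.
df = pd.DataFrame(
    {
        "col1": ["A", "B", "C"],
        "col2": [3, 4, 5],
        "col3": [6.32, 8.1, 4.9],
    }
)

# The comparison produces a boolean Series; the square brackets use it to select rows.
small = df[df["col3"] < 7]
print(small)

# Parentheses group each comparison before & combines them.
both = df[(df["col3"] < 7) & (df["col2"] > 3)]
print(both)
```

This one expression uses three of the bracket types from this guide: square brackets to select, angle brackets to compare, and parentheses to group.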
Conclusion#
It’s totally reasonable to be confused about which brackets mean what in which contexts. Be patient; it just takes time to sink in.
See also: How do I select a subset of a DataFrame?