Ed Helper


This notebook is meant to assist with Between-Class Participation grading. To use:

1. Download the Discussion data
   1. Go to Ed
   2. Open analytics
   3. Adjust the dates to the relevant range
   4. Download the Threads JSON
2. Make a copy of this notebook
3. Upload the data
4. Adjust the filename below
5. Run all cells in the notebook
6. Review the student contributions at the bottom

Load data

import json
from pathlib import Path
import pandas as pd

path = Path("..", "FILENAME.json")
# use a context manager so the file is closed after reading
with path.open(encoding="utf-8") as f:
    data = json.load(f)
threads = pd.json_normalize(data)
# threads

threads.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 37 entries, 0 to 36
Data columns (total 21 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   url             37 non-null     object
 1   type            37 non-null     object
 2   number          37 non-null     int64 
 3   title           37 non-null     object
 4   category        37 non-null     object
 5   subcategory     37 non-null     object
 6   subsubcategory  37 non-null     object
 7   votes           37 non-null     int64 
 8   views           37 non-null     int64 
 9   unique_views    37 non-null     int64 
 10  private         37 non-null     bool  
 11  anonymous       37 non-null     bool  
 12  endorsed        37 non-null     bool  
 13  created_at      37 non-null     object
 14  text            37 non-null     object
 15  document        37 non-null     object
 16  comments        37 non-null     object
 17  user.name       37 non-null     object
 18  user.email      37 non-null     object
 19  user.role       37 non-null     object
 20  answers         21 non-null     object
dtypes: bool(3), int64(4), object(14)
memory usage: 5.4+ KB
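
The dotted column names above (`user.name`, `user.email`, `user.role`) come from `pd.json_normalize` flattening nested objects. A minimal sketch of that behavior, using made-up records; the field values and the `"student"` role are assumptions for illustration, not taken from a real export:

```python
import pandas as pd

# Hypothetical records shaped roughly like thread objects in the export.
# Only "admin" appears in this notebook; "student" is an assumed role value.
sample = [
    {"number": 1, "title": "Week 1 question", "user": {"name": "Ada", "role": "student"}},
    {"number": 2, "title": "Welcome", "user": {"name": "Grace", "role": "admin"}},
]

# Nested dicts become columns named "<parent>.<key>".
flat = pd.json_normalize(sample)
print(flat.columns.tolist())
```

Lists, such as `comments` and `answers`, are not expanded this way; they stay in a single `object` column, which is why the notebook handles them separately.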

Include replies

The JSON data includes replies (comments and answers) nested under each post. The cells below flatten them into rows alongside the threads.

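# one row per top-level comment; threads without comments drop out
comments = pd.json_normalize(threads["comments"].explode().dropna().tolist())
# comments

# one row per top-level answer ("answers" is missing for some threads)
replies = pd.json_normalize(threads["answers"].explode().dropna().tolist())
# replies

# stack threads and replies into a single table with a fresh index
posts = pd.concat([threads, comments, replies], ignore_index=True)
# posts

# parse the timestamps so they can be sorted and filtered
posts["created_at"] = pd.to_datetime(posts["created_at"])
# posts["created_at"]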
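
To see what the explode-then-normalize step does, here is a small sketch on toy data. The `threads_demo` frame and its contents are hypothetical; only the pattern mirrors the cells above:

```python
import pandas as pd

# Hypothetical stand-in for the threads table: one thread with two
# comments, one thread with none.
threads_demo = pd.DataFrame(
    {
        "title": ["A", "B"],
        "comments": [
            [
                {"text": "first", "user": {"name": "Ada"}},
                {"text": "second", "user": {"name": "Lin"}},
            ],
            [],
        ],
    }
)

# explode() gives one row per list element; an empty list becomes NaN,
# which dropna() then removes.
exploded = threads_demo["comments"].explode().dropna()

# Each remaining element is a dict, so json_normalize can flatten it.
comments_demo = pd.json_normalize(exploded.tolist())
print(comments_demo)
```

Note that this only reaches the first level of nesting. If a comment carried its own nested replies, they would remain inside a column of the flattened frame rather than becoming rows.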

Prep output

output = posts.copy()

# exclude the instructors
output = output[output["user.role"] != "admin"]

# sort by name
output = output.sort_values("user.name")

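# only include a subset of the columns; copy so the url column can be
# reassigned below without a SettingWithCopyWarning on older pandas
output = output[
    [
        "user.name",
        "url",
        # "title",
        "text",
    ]
].copy()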

# make links clickable
# https://stackoverflow.com/a/20043785/358804
output["url"] = output["url"].apply(lambda url: f'<a href="{url}">Open</a>')

# render newlines
# https://stackoverflow.com/a/56881411/358804
styled = output.style.set_properties(
    **{
        "text-align": "left",
        "white-space": "pre-wrap",
    }
)

Output

from IPython.display import HTML

HTML(styled.to_html(escape=False))
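
Alongside the full table, a per-student tally can help when skimming for who participated. This is a rough sketch on made-up data, not part of the original workflow; `posts_demo` and the `"student"` role value are assumptions, and in the notebook the same grouping could be applied to `posts`:

```python
import pandas as pd

# Hypothetical posts table with the two columns the tally needs.
posts_demo = pd.DataFrame(
    {
        "user.name": ["Ada", "Lin", "Ada", "Grace"],
        "user.role": ["student", "student", "student", "admin"],
    }
)

# Same instructor filter as in the output prep, then count rows per name.
students = posts_demo[posts_demo["user.role"] != "admin"]
counts = students.groupby("user.name").size().sort_values(ascending=False)
print(counts)
```

A count says nothing about the quality of a contribution, so it is best read next to the text in the table above rather than in place of it.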