Ed Helper#

This notebook is meant to assist with Between-Class Participation grading. To use:

  1. Download the Discussion data

    1. Go to Ed

    2. Open analytics

    3. Adjust the dates to the relevant range

    4. Download the Threads JSON

  2. Make a copy of this notebook

  3. Upload the data

  4. Adjust the filename below

  5. Run all cells in the notebook

  6. Review the student contributions at the bottom

Load data#

import json
from pathlib import Path
import pandas as pd

# replace FILENAME.json with the name of the uploaded file
path = Path("..", "FILENAME.json")
with open(path, encoding="utf-8") as f:
    data = json.load(f)
threads = pd.json_normalize(data)
# threads
threads.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 37 entries, 0 to 36
Data columns (total 21 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   url             37 non-null     object
 1   type            37 non-null     object
 2   number          37 non-null     int64 
 3   title           37 non-null     object
 4   category        37 non-null     object
 5   subcategory     37 non-null     object
 6   subsubcategory  37 non-null     object
 7   votes           37 non-null     int64 
 8   views           37 non-null     int64 
 9   unique_views    37 non-null     int64 
 10  private         37 non-null     bool  
 11  anonymous       37 non-null     bool  
 12  endorsed        37 non-null     bool  
 13  created_at      37 non-null     object
 14  text            37 non-null     object
 15  document        37 non-null     object
 16  comments        37 non-null     object
 17  user.name       37 non-null     object
 18  user.email      37 non-null     object
 19  user.role       37 non-null     object
 20  answers         21 non-null     object
dtypes: bool(3), int64(4), object(14)
memory usage: 5.4+ KB
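The dotted column names in the listing above (`user.name`, `user.email`, `user.role`) come from `pd.json_normalize`, which flattens nested dictionaries into one column per key. A minimal sketch of that behavior, using made-up records rather than real Ed data (the titles and names here are hypothetical):

```python
import pandas as pd

# Hypothetical records shaped like the export: each thread carries a nested "user" dict.
sample = [
    {"title": "Week 1 question", "user": {"name": "Ada", "role": "student"}},
    {"title": "Welcome", "user": {"name": "Grace", "role": "admin"}},
]

# Nested keys are joined to their parent key with a dot.
flat = pd.json_normalize(sample)
print(list(flat.columns))
```

This is why later cells can filter on `user.role` and sort on `user.name` as ordinary columns.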

Include replies#

The JSON data includes replies (comments and answers) nested under each post.

comments = pd.json_normalize(threads["comments"].explode().dropna())
# comments
replies = pd.json_normalize(threads["answers"].explode().dropna())
# replies
posts = pd.concat([threads, comments, replies], ignore_index=True)
# posts
posts["created_at"] = pd.to_datetime(posts["created_at"])
# posts["created_at"]
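The cell above flattens one level of nesting. If replies can themselves carry replies under their own `comments` key (an assumption about the export format, not something confirmed here), a recursive walk would pick those up as well. A rough sketch in plain Python, with a hypothetical helper named `iter_posts` and made-up sample data:

```python
from typing import Iterator


def iter_posts(post: dict) -> Iterator[dict]:
    """Yield a post followed by every reply nested beneath it."""
    yield post
    for key in ("comments", "answers"):
        # The key may be missing or None (not every thread has answers),
        # so fall back to an empty list.
        for reply in post.get(key) or []:
            yield from iter_posts(reply)


# Made-up thread: one comment that has a nested comment of its own.
thread = {
    "text": "How do I submit?",
    "comments": [
        {"text": "Same question", "comments": [{"text": "See the syllabus"}]}
    ],
    "answers": None,
}

texts = [p["text"] for p in iter_posts(thread)]
print(texts)
```

The resulting list of dictionaries could be passed to `pd.json_normalize` in place of the separate `comments` and `answers` frames, if deeper nesting turns out to matter for your course.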

Prep output#

output = posts.copy()

# exclude the instructors
output = output[output["user.role"] != "admin"]

# sort by name
output = output.sort_values("user.name")

# only include a subset of the columns
output = output[
    [
        "user.name",
        "url",
        # "title",
        "text",
    ]
]

# make links clickable
# https://stackoverflow.com/a/20043785/358804
output["url"] = output["url"].apply(lambda url: f'<a href="{url}">Open</a>')

# render newlines
# https://stackoverflow.com/a/56881411/358804
styled = output.style.set_properties(
    **{
        "text-align": "left",
        "white-space": "pre-wrap",
    }
)
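Because the links are injected as raw HTML, any `<` or `&` in a student's post may also be interpreted as markup when the table is rendered. A minimal sketch of one way to guard against that, using the standard library's `html.escape`; the helper names `make_link` and `escape_text` are hypothetical and not part of the notebook:

```python
import html


def make_link(url: str, label: str = "Open") -> str:
    """Build an anchor tag, escaping the URL and label for use in HTML."""
    return f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'


def escape_text(text: str) -> str:
    """Escape markup characters so a post renders as literal text."""
    return html.escape(text, quote=False)


print(make_link("https://example.com/thread?id=1&view=all"))
print(escape_text("Use <b> & </b> for bold"))
```

If this matters for your data, the helpers could be applied with `output["url"].apply(make_link)` and `output["text"].apply(escape_text)` before the styling step.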

Output#

from IPython.display import HTML

# Styler output is not HTML-escaped by default, so the <a> tags render as links
HTML(styled.to_html())
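For a quick overview before reading individual contributions, a per-student tally can help. The sketch below assumes a table with the same `user.name` column as `output`; `sample_output` and its contents are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the `output` table built earlier in the notebook.
sample_output = pd.DataFrame(
    {
        "user.name": ["Ada", "Ada", "Grace", "Ada"],
        "text": ["first post", "a reply", "an answer", "another reply"],
    }
)

# Number of posts, comments, and answers per student, most active first.
counts = sample_output["user.name"].value_counts()
print(counts)
```

A count says nothing about quality, so it is best treated as a starting point for the manual review rather than a grade.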