Creating Learner Progress Monitoring Using Python, Pandas and Streamlit

Why I Built This

In my role as Academic Program Manager for the AWS re/Start program, I was managing multiple cohorts at the same time. Each cohort had its own Canvas LMS gradebook, and each gradebook could have more than 170 columns of mixed data: Knowledge Checks, Labs, scores, and completion flags, all in one file.

At first, reviewing each cohort manually was fine. But once the program grew to 10 or more cohorts per batch, the manual approach took too much time and made mistakes easy.

I needed something better, so I built a tool for it.


The Problem in Practice

Canvas LMS is actually a good system. But the gradebook exports it gives you are designed to be complete, not easy to read. Here is what I was dealing with every week:

  • One CSV file per cohort
  • 170+ columns per file, mixing Knowledge Checks, Labs, and other things
  • No clear separation between KC scores, Lab completion, and general course data
  • Weekly progress targets that needed to be checked for each learner

With 10 cohorts, that is 10 files and hundreds of rows to go through manually. When programs also ran in India at the same time, it was more than 40 cohorts total. Manual review was not practical anymore.


What I Built

I made a web application using Python, Pandas, and Streamlit. The idea was to keep it simple: upload the files, get the results.

The flow is:

graph LR
    A[Upload CSVs] --> B[Parse Data]
    B --> C[Aggregate KC and Lab]
    C --> D[Check Weekly Targets]
    D --> E[Download Summary]

No complicated setup. The people using it are program coordinators, not engineers, so the interface had to be straightforward.
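
To make the upload-to-download flow concrete, here is a minimal sketch of how such a page could be wired up in Streamlit. It is not the original app: `build_summary`, `summary_filename`, `to_csv_bytes`, and `render_app` are hypothetical names, and the aggregation step is left as a placeholder that returns the uploaded data unchanged.

```python
import pandas as pd


def build_summary(gradebook: pd.DataFrame) -> pd.DataFrame:
    """Placeholder for the aggregation step; here it returns the frame unchanged."""
    return gradebook


def summary_filename(upload_name: str) -> str:
    """Hypothetical naming convention: cohort_01.csv -> cohort_01_summary.csv."""
    stem = upload_name.rsplit(".", 1)[0]
    return f"{stem}_summary.csv"


def to_csv_bytes(frame: pd.DataFrame) -> bytes:
    """Serialise a summary so it can be handed to a download button."""
    return frame.to_csv(index=False).encode("utf-8")


def render_app() -> None:
    # Imported lazily so the helpers above can be used without Streamlit installed.
    import streamlit as st

    st.title("Learner Progress Monitoring")
    uploads = st.file_uploader(
        "Gradebook CSVs (one per cohort)",
        type="csv",
        accept_multiple_files=True,
    )
    if not uploads:
        st.info("Upload one or more gradebook exports to begin.")
        return

    for upload in uploads:
        summary = build_summary(pd.read_csv(upload))
        st.subheader(upload.name)
        st.dataframe(summary)
        st.download_button(
            "Download summary",
            data=to_csv_bytes(summary),
            file_name=summary_filename(upload.name),
            mime="text/csv",
            key=upload.name,  # each button in the loop needs a unique key
        )
```

To try it, add a call to `render_app()` at the end of the file and start it with `streamlit run` followed by the file name. Keeping the helpers separate from the UI function means they can be checked without a browser.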

What the app does:

  • Accept multiple CSV uploads at once (one per cohort)
  • Automatically find and parse Knowledge Check columns
  • Automatically find and parse Lab completion columns
  • Calculate each learner’s progress against the weekly targets
  • Output a clean, downloadable summary per cohort
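
The column detection and target check listed above could look roughly like the following Pandas sketch. Everything specific here is an assumption for illustration: that Knowledge Check and Lab columns can be recognised by a substring in the header, that a KC counts as done when a numeric score is present, that a Lab counts as done when its score is above zero, and that the learner name lives in a `Student` column. The function name `summarize_cohort` and the toy data are hypothetical. Real exports may also carry extra non-learner rows (for example a points-possible row) that would need dropping first.

```python
import pandas as pd

# Assumption: KC and Lab columns can be recognised by a substring in the header.
KC_PATTERN = "Knowledge Check"
LAB_PATTERN = "Lab"


def summarize_cohort(gradebook: pd.DataFrame, kc_target: int, lab_target: int) -> pd.DataFrame:
    """Return one row per learner with KC/Lab counts and an on-track flag."""
    kc_cols = [c for c in gradebook.columns if KC_PATTERN in c]
    lab_cols = [c for c in gradebook.columns if LAB_PATTERN in c]

    # Coerce mixed data to numbers; blanks and text become NaN.
    kc = gradebook[kc_cols].apply(pd.to_numeric, errors="coerce")
    labs = gradebook[lab_cols].apply(pd.to_numeric, errors="coerce")

    summary = pd.DataFrame({
        "Student": gradebook["Student"],
        "kc_done": kc.notna().sum(axis=1),     # KCs with any numeric score
        "kc_avg": kc.mean(axis=1).round(1),    # mean over attempted KCs only
        "labs_done": (labs > 0).sum(axis=1),   # Labs with a score above zero
    })
    summary["on_track"] = (
        (summary["kc_done"] >= kc_target) & (summary["labs_done"] >= lab_target)
    )
    return summary


# Toy gradebook standing in for a 170+ column export.
demo = pd.DataFrame({
    "Student": ["Learner A", "Learner B"],
    "Knowledge Check 1 (101)": [80, 100],
    "Knowledge Check 2 (102)": ["", 90],
    "Lab 1 (201)": [1, 1],
    "Lab 2 (202)": [0, 1],
})
result = summarize_cohort(demo, kc_target=2, lab_target=2)
print(result)
```

With a weekly target of two KCs and two Labs, Learner A (one KC, one Lab) is flagged as behind and Learner B as on track. The targets are passed in as parameters so the same function can be reused as the expected count changes from week to week.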

Tech Stack

I picked tools that I knew well and that would be fast to build and easy to maintain later.

  • Python for the core logic
  • Pandas for handling 170+ column DataFrames without it getting messy
  • Streamlit because building a working web UI in Python takes hours, not days
  • PythonAnywhere for hosting, so anyone can access it without installing anything

This is not the most sophisticated stack, but it was the right one for this problem. The tool needed to run reliably and be usable by non-technical team members. This stack delivered that.


One Extra Thing: The Leaderboard

After the main tool was working, I added a learner-facing progress leaderboard using Google Sheets.

The idea was simple. Learners could see their own KC and Lab scores compared to others in the cohort. This did two things:

  1. It reduced the number of reminder messages coordinators had to send
  2. It gave learners a way to monitor themselves without waiting for someone to tell them how they were doing

It was a small addition but it made the experience better for both sides.


Impact

The tool ended up being used across multiple batches:

  • 3 AWS re/Start batches total
  • 10+ cohorts in Indonesia
  • 40+ cohorts in India

The time to process each batch dropped a lot compared to doing it manually. Coordinators could spend that saved time on actual learner support instead of spreadsheet work.


What I Would Change Now

Looking back at this, there are things I would do differently:

Automated gradebook retrieval. Right now, someone has to download the files from Canvas and upload them. It would be better to pull directly from the Canvas API.

Built-in charts. The output is a clean CSV, but visualizing trends over time still requires extra work. Putting basic charts directly in the app would help a lot.

Timeline tracking. Right now the tool shows current state. Tracking progress over weeks would make it easier to see patterns early.
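
One possible shape for this, sketched under assumptions: if each week's summary were kept as a snapshot, the snapshots could be stacked and pivoted into a learner-by-week table. The column names `Student` and `kc_done` and the function `progress_over_weeks` are hypothetical, and the sample data is made up for the example.

```python
import pandas as pd


def progress_over_weeks(snapshots: dict[int, pd.DataFrame]) -> pd.DataFrame:
    """Combine weekly summaries into a learner x week table of completed KCs."""
    stacked = pd.concat(
        [frame.assign(week=week) for week, frame in snapshots.items()],
        ignore_index=True,
    )
    return stacked.pivot(index="Student", columns="week", values="kc_done")


week1 = pd.DataFrame({"Student": ["Learner A", "Learner B"], "kc_done": [2, 3]})
week2 = pd.DataFrame({"Student": ["Learner A", "Learner B"], "kc_done": [2, 6]})

timeline = progress_over_weeks({1: week1, 2: week2})

# Learners whose count did not move between the last two weeks.
last_change = timeline.diff(axis=1).iloc[:, -1]
stalled = timeline[last_change == 0].index.tolist()
print(stalled)
```

Here Learner A completed no new Knowledge Checks between week 1 and week 2, so they show up in `stalled`. A table like this is also a natural input for a line chart per learner.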

But for what it was solving at the time, it worked well.
