canvas-mcp

Example: Bulk Grading Jupyter Notebooks

This example demonstrates the 99.7% token savings achieved by using the code execution API for bulk grading operations.

Scenario

Grade 90 Jupyter notebook submissions for Assignment 123.

Traditional Approach (1.35M tokens ❌)

The Problem

// Load ALL submissions into context
const submissions = await list_submissions({
  courseIdentifier: "60366",
  assignmentId: "123"
});
// → 90 submissions × 15K tokens each = 1.35M tokens!

// Process each one (more tokens!)
for (const sub of submissions) {
  // Each submission's full data is in Claude's context
  const notebook = findNotebook(sub);
  const analysis = analyzeNotebook(notebook);

  await grade_with_rubric({
    courseIdentifier: "60366",
    assignmentId: "123",
    userId: sub.userId,
    rubricAssessment: { ... },
    comment: analysis.comment
  });
}

Why This Is Inefficient

- Every submission's full data (~15K tokens) must pass through Claude's context before any grading logic can run.
- 90 submissions × 15K tokens ≈ 1.35M tokens, just to load the data.
- Each submission is then graded sequentially, one tool call at a time.
- Token limits cap this approach at roughly 100 submissions.

Code Execution Approach (3.5K tokens ✅)

The Solution

import { bulkGrade } from './canvas/grading/bulkGrade';

await bulkGrade({
  courseIdentifier: "60366",
  assignmentId: "123",
  gradingFunction: (submission) => {
    // ⭐ This function runs LOCALLY in the execution environment
    // ⭐ Submissions never enter Claude's context!

    const notebook = submission.attachments?.find(
      f => f.filename.endsWith('.ipynb')
    );

    if (!notebook) {
      console.log(`No notebook for user ${submission.userId}`);
      return null; // Skip this submission
    }

    // Download and analyze notebook (locally!)
    const analysis = analyzeNotebook(notebook.url);

    if (analysis.hasErrors) {
      return {
        points: 0,
        rubricAssessment: {
          "_8027": {
            points: 0,
            comments: `Found ${analysis.errors.length} errors: ${analysis.errors.join(', ')}`
          }
        },
        comment: "Please fix errors and resubmit. See rubric for details."
      };
    }

    // No errors - full points!
    return {
      points: 100,
      rubricAssessment: {
        "_8027": {
          points: 100,
          comments: "Excellent work! All cells executed successfully."
        }
      },
      comment: "Great submission! Notebook runs perfectly without errors."
    };
  }
});

What You See (Output)

Starting bulk grading for assignment 123...
Found 90 submissions to process

✓ Graded submission for user 12345
✓ Graded submission for user 12346
Skipped submission for user 12347 (no notebook)
✓ Graded submission for user 12348
✗ Failed to grade user 12349: Network timeout
...

Bulk grading complete:
  Total: 90
  Graded: 87
  Skipped: 2
  Failed: 1

First 5 results:
  - User 12345: ✓ Success
  - User 12346: ✓ Success
  - User 12347: Skipped
  - User 12348: ✓ Success
  - User 12349: ✗ Failed

Why This Is Efficient

- The gradingFunction runs locally in the execution environment; submission data never enters Claude's context.
- Submissions are processed concurrently instead of one at a time.
- Only the final summary (counts plus a few sample results) comes back to Claude.

Token Comparison

| Metric | Traditional | Code Execution | Savings |
|--------|-------------|----------------|---------|
| Token Usage | 1.35M | 3.5K | 99.7% |
| Data Location | Claude's context | Execution environment | Local |
| Processing Speed | Slow (sequential) | Fast (concurrent) | 10x+ |
| Max Submissions | ~100 (token limits) | Unlimited | |
| Cost (approximate) | High | Minimal | ~$0.02 vs ~$5 |
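
The headline savings figure follows directly from the token counts above. A quick sanity check in TypeScript, using only the numbers from the table:

// Token counts from the comparison above
const traditionalTokens = 90 * 15_000;  // 1,350,000 (all submissions in context)
const codeExecTokens = 3_500;           // only the summary enters context

const savings = 1 - codeExecTokens / traditionalTokens;
console.log(`${(savings * 100).toFixed(1)}% token savings`);  // → 99.7% token savings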

Advanced Example: Custom Analysis

You can implement any grading logic you want:

await bulkGrade({
  courseIdentifier: "60366",
  assignmentId: "123",
  gradingFunction: (submission) => {
    const notebook = submission.attachments?.find(
      f => f.filename.endsWith('.ipynb')
    );

    if (!notebook) return null;

    // Custom analysis logic
    const analysis = {
      cellCount: countCells(notebook),
      hasDocstrings: checkDocstrings(notebook),
      passesTests: runTests(notebook),
      codeQuality: analyzeCodeQuality(notebook)
    };

    // Complex grading rubric
    let points = 0;
    const rubricComments: Record<string, any> = {};

    // Criterion 1: Functionality (50 points)
    if (analysis.passesTests) {
      points += 50;
      rubricComments["_8027"] = {
        points: 50,
        comments: "All tests pass! ✓"
      };
    } else {
      rubricComments["_8027"] = {
        points: 0,
        comments: "Some tests failed. See notebook for details."
      };
    }

    // Criterion 2: Documentation (30 points)
    const docPoints = analysis.hasDocstrings ? 30 : 15;
    points += docPoints;
    rubricComments["_8028"] = {
      points: docPoints,
      comments: analysis.hasDocstrings
        ? "Excellent documentation!"
        : "Add more docstrings to improve documentation."
    };

    // Criterion 3: Code Quality (20 points)
    const qualityPoints = Math.min(20, analysis.codeQuality * 20);
    points += qualityPoints;
    rubricComments["_8029"] = {
      points: qualityPoints,
      comments: `Code quality score: ${analysis.codeQuality * 100}%`
    };

    return {
      points,
      rubricAssessment: rubricComments,
      comment: `Total: ${points}/100. Great work on ${
        analysis.passesTests ? 'passing all tests' : 'your effort'
      }!`
    };
  }
});

Dry Run Mode (Testing)

Test your grading logic without actually grading:

await bulkGrade({
  courseIdentifier: "60366",
  assignmentId: "123",
  dryRun: true,  // ⭐ Test mode - doesn't actually grade
  gradingFunction: (submission) => {
    // Your grading logic here
    console.log(`Would grade: ${submission.userId}`);
    return { points: 100, ... };
  }
});

Best Practices

  1. Always test with dry run first before grading for real (see the combined sketch after this list)
  2. Handle errors gracefully - return null to skip problematic submissions
  3. Provide detailed rubric comments to help students understand their grades
  4. Log progress using console.log() to track grading status
  5. Validate rubric criterion IDs before grading (use list_assignment_rubrics)
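
A minimal sketch that combines these practices, using only the bulkGrade options shown earlier (the criterion ID "_8027" is the illustrative one from the examples above):

// Step 1: dry run - verify the logic without writing any grades
await bulkGrade({
  courseIdentifier: "60366",
  assignmentId: "123",
  dryRun: true,
  gradingFunction: (submission) => {
    const notebook = submission.attachments?.find(
      f => f.filename.endsWith('.ipynb')
    );

    if (!notebook) {
      console.log(`Skipping user ${submission.userId}: no notebook`);  // log progress
      return null;  // skip gracefully instead of throwing
    }

    return {
      points: 100,
      rubricAssessment: {
        // criterion ID verified beforehand with list_assignment_rubrics
        "_8027": { points: 100, comments: "All cells executed successfully." }
      },
      comment: "Notebook runs without errors."
    };
  }
});

// Step 2: once the dry-run output looks correct, drop dryRun and grade for real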

Common Rubric Criterion ID Patterns

Canvas rubric criterion IDs typically start with an underscore, like the "_8027", "_8028", and "_8029" used in the examples above.

To find the correct IDs for your rubric:

// First, discover the rubric structure
const rubric = await search_canvas_tools("list_assignment_rubrics", "full");

// Then use the correct criterion IDs in bulkGrade
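
Once you have the real criterion IDs, reference them in your gradingFunction. A minimal sketch, where "_8027" stands in for whichever IDs your rubric actually uses:

await bulkGrade({
  courseIdentifier: "60366",
  assignmentId: "123",
  gradingFunction: (submission) => ({
    points: 100,
    rubricAssessment: {
      // substitute the criterion IDs discovered above
      "_8027": { points: 100, comments: "Meets this criterion." }
    },
    comment: "Graded against the verified rubric."
  })
});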

Troubleshooting

“No exported function found”

“Criterion ID not found”

Verify the criterion IDs against the assignment's rubric with list_assignment_rubrics before grading; they must match exactly (see Best Practices above).

“Rate limit exceeded”

“Submission not found”

Summary

The code execution API transforms bulk grading from a token-intensive operation into an efficient, scalable workflow: submissions stay in the execution environment, grading runs concurrently, and only a short summary enters Claude's context.

This same pattern works for any bulk operation, not just grading.

Result: 99.7% token savings + faster execution + better scalability 🎉