Effective and timely feedback in educational assessments is essential but labor-intensive, especially for complex tasks. Advances in machine learning have enabled automated feedback systems that range from deterministic response grading to the evaluation of semi-open and open-ended essays. Pre-trained large language models such as GPT-4 offer promising new opportunities for processing diverse response types efficiently and with minimal customization. This study evaluates the effectiveness of a pre-trained GPT-4 model in grading semi-open handwritten responses in a university-level mathematics exam. Our findings indicate that GPT-4 provides surprisingly reliable and cost-effective initial grading, subject to subsequent human verification. Future research should focus on refining grading rules and improving the extraction of handwritten responses to further leverage these technologies.