Introducing Thaislate: AI-Powered Grammar Learning

NLP · September 2025 · 8 min read
Thaislate Thai Interface

Abstract: Thai learners struggle with English tenses because Thai lacks grammatical tense markers. Existing translation tools provide corrections without explanations, leaving learners unable to understand their errors. This dissertation develops Thaislate, a proof-of-concept system demonstrating how large language models can bridge this gap through context-aware grammar explanations.

The Challenge: When Languages Don't Align

Imagine a Thai university student named Somchai working on his thesis proposal in English. He writes "I am study at university for three years" and knows something sounds wrong, but cannot explain why. His translation app suggests "I have been studying at university for three years," but offers no explanation of why the present perfect continuous tense is appropriate here. Frustrated, Somchai memorises the correction without understanding the underlying temporal logic, leaving him likely to repeat similar errors.

This scenario illustrates a widespread challenge: Thai learners struggle with English tense usage because Thai is a "tenseless language" that relies on contextual cues rather than grammatical markers to convey time relationships. Current educational tools compound this problem. Translation systems provide corrections without explanation, while grammar instruction emphasises rote memorisation over contextual understanding.

Central Research Question

Can AI-powered tools help Thai learners understand English tense usage through contextual, automatically generated explanations?

The Solution: Thaislate

This question drives the development of Thaislate, a web-based tool that helps Thai learners understand English tenses through explanations in their native language. When a user types a Thai sentence like "chan gin khao chao laew" (I ate breakfast already), the system not only translates it to "I have eaten breakfast" but also explains why the present perfect tense is used instead of past simple.

The explanation clarifies that the Thai word "laew" (already) indicates a completed action with present relevance, which corresponds to English present perfect usage. The tool acts like a knowledgeable tutor who understands both languages and can bridge the conceptual gap between Thai and English temporal systems, transforming simple translation interactions into learning opportunities.

Thaislate English Interface

Research Aims and Objectives

This research develops Thaislate as a proof-of-concept system that demonstrates how large language models can be integrated to provide grammar-aware translation with educational explanations. The research validates the technical approach and user acceptance, establishing the foundation for future studies on educational effectiveness.

Five Key Objectives

  1. Build the Core System: To implement a three-component pipeline integrating Thai-English translation, tense identification, and educational explanation generation. This architecture emerged from iterative testing after monolithic approaches failed to achieve acceptable accuracy, particularly in tense classification.
  2. Develop Tense Recognition System: To create a hierarchical classification system capable of identifying not only broad tense categories (Past, Present, Future) but also detailed temporal distinctions (24 specific uses such as "completed action with present relevance" or "scheduled future events").
  3. Create an Intuitive Learning Interface: To develop a user-friendly web application where Thai learners can input sentences in Thai and receive accurate English translations accompanied by clear explanations that bridge Thai and English temporal concepts.
  4. Validate System Design and User Acceptance: To assess both the technical accuracy of the system and user acceptance through user studies with Thai English learners, validating that the tool addresses identified needs.
  5. Contribute to Educational AI Research: To contribute new insights into how artificial intelligence systems can be designed for educational applications in contexts where linguistic resources are limited.
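
The two-level scheme in objective 2 can be pictured as a mapping from fine-grained uses to broad categories. The sketch below is illustrative only: the two quoted label descriptions come from the text above, while the third label, the coarse category assigned to each label, and the function names are assumptions rather than the system's actual label set or code.

```python
from collections import defaultdict

# Fine-grained use -> broad category. Only the first two descriptions are
# quoted from the article; the third label and all category assignments are
# hypothetical placeholders (the real system has 24 fine-grained uses).
FINE_TO_COARSE = {
    "completed action with present relevance": "Present",
    "scheduled future events": "Future",
    "finished past event": "Past",  # hypothetical label
}


def coarse_scores(fine_scores: dict[str, float]) -> dict[str, float]:
    """Sum fine-grained probabilities into their broad categories."""
    totals: dict[str, float] = defaultdict(float)
    for fine_label, probability in fine_scores.items():
        totals[FINE_TO_COARSE[fine_label]] += probability
    return dict(totals)


def predict(fine_scores: dict[str, float]) -> tuple[str, str]:
    """Return (broad category, fine-grained use) for the top-scoring use."""
    best_fine = max(fine_scores, key=fine_scores.get)
    return FINE_TO_COARSE[best_fine], best_fine


# Made-up classifier output for "I have eaten breakfast".
example_scores = {
    "completed action with present relevance": 0.7,
    "scheduled future events": 0.2,
    "finished past event": 0.1,
}
prediction = predict(example_scores)
```

Deriving the broad category from the fine-grained label keeps the two levels consistent by construction, which is one simple way to realise a hierarchy; the dissertation may use a different mechanism.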

Technical Architecture Overview

The system integrates three specialised models in a pipeline architecture:

  • Typhoon Translate 4B for Thai-English translation
  • Custom-trained XLM-RoBERTa achieving 94.7% accuracy on 24-category tense classification
  • Typhoon 2.1 12B generating educational explanations

Thaislate Complete Architecture
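
As a rough illustration of how three such stages might be chained, here is a minimal sketch. It does not call the actual models: the stage functions are stand-in stubs, every name (run_pipeline, PipelineResult, TenseLabel) is hypothetical, and the assumption that the classifier reads the English translation rather than the Thai input is not confirmed by the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TenseLabel:
    coarse: str  # broad category, e.g. "Present"
    fine: str    # specific use, e.g. "completed action with present relevance"


@dataclass
class PipelineResult:
    thai_input: str
    translation: str
    tense: TenseLabel
    explanation: str


def run_pipeline(
    thai_text: str,
    translate: Callable[[str], str],
    classify: Callable[[str], TenseLabel],
    explain: Callable[[str, str, TenseLabel], str],
) -> PipelineResult:
    """Run translation, tense classification, and explanation in sequence."""
    translation = translate(thai_text)
    # Assumption: the classifier is given the English translation.
    tense = classify(translation)
    explanation = explain(thai_text, translation, tense)
    return PipelineResult(thai_text, translation, tense, explanation)


# Stub stages standing in for the three models, for demonstration only.
def stub_translate(thai_text: str) -> str:
    return "I have eaten breakfast."


def stub_classify(english_text: str) -> TenseLabel:
    return TenseLabel("Present", "completed action with present relevance")


def stub_explain(thai_text: str, english_text: str, tense: TenseLabel) -> str:
    return f"'{english_text}' is classified as {tense.coarse}: {tense.fine}."


result = run_pipeline(
    "chan gin khao chao laew", stub_translate, stub_classify, stub_explain
)
```

Passing the stages in as callables keeps each model swappable and lets the chain be tested with stubs, which suits a design where each component is developed and evaluated separately.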

Validation and Results

User testing with 38 Thai learners produced 474 ratings averaging 4.2/5, with explanation quality rated 4.33/5 even though tense classification accuracy within the full pipeline was 74%. This gap between technical performance and user satisfaction supports the approach: learners value clear explanations even when the underlying classification is imperfect.

The system successfully serves real users with 99.5% uptime, establishing technical feasibility and user acceptance as a foundation for future longitudinal studies on learning effectiveness.

Research Journey Overview

Literature Review

Examines existing research in machine translation for Thai-English language pairs, grammatical tense classification systems, and computer-assisted language learning tools.

User Study and Requirements Analysis

Presents empirical research with 218 Thai English learners to understand their specific challenges and preferences for grammar learning tools.

Design and Implementation - Core Models

Focuses on the development of the three specialised AI models that form the system's educational core, from initial experimentation to final architecture.

Design and Implementation - Pipeline and Website

Describes how individual models integrate into a complete educational system, including web application architecture and production deployment.

Evaluation and Results

Presents comprehensive evaluation across technical performance, pipeline integration accuracy, and user acceptance validation.

Conclusion and Future Work

Summarises key achievements, reflects on challenges encountered, and proposes future research directions in educational NLP.

Source Code