This paper introduces Coherent Masked Diffusion (CoMD), a framework designed to improve how masked language models (MLMs) learn coherent and incoherent language. Building on the Masked Language Diffusion (MLD) model, CoMD incorporates three key innovations: a fixed mask matrix, a coherent loss term, and a variable time parameter. The fixed mask matrix makes the masking process independent of each token's identity and position, which simplifies the denoising process. The coherent loss term optimizes the probability of generating coherent text without requiring additional samples per training step. The variable time parameter guides the model's predicted coherent probability toward the ground-truth coherent probability, further improving performance. Empirically, CoMD outperforms previous methods on multiple coherence benchmarks, with significant speedups and better parameter efficiency than autoregressive models.

The paper provides a thorough background on MLMs, MLD, and masked diffusion language models (MDLMs), which helps readers understand the context and significance of the proposed framework.

However, the paper would benefit from a more detailed discussion of the limitations of the fixed mask matrix, particularly its ability to capture long-range dependencies and its sensitivity to different initializations. It also omits specific details on the implementation of the denoising network and the training procedure, which are crucial for reproducibility, and it lacks a qualitative analysis of the generated text, which would offer valuable insight into the model's behavior.

Despite these limitations, CoMD represents a significant step forward in natural language processing, particularly for coherent text generation.
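To make the review's point about the fixed mask matrix concrete, the sketch below shows a standard absorbing-state masked-diffusion forward step, in which every token is independently replaced by a [MASK] token with the same probability at a given time `t`, regardless of the token's identity or position. This is a generic illustration of that property, not the paper's actual implementation; all names (`MASK_ID`, `forward_mask`) are invented for the example.

```python
import random

MASK_ID = 0  # illustrative id for the [MASK] token (assumption, not from the paper)

def forward_mask(tokens, t, rng):
    """Absorbing-state forward step: replace each token with MASK_ID with
    probability t, independently of its identity and position -- the
    'fixed mask' property described above. t=0 masks nothing; t=1 masks all."""
    return [MASK_ID if rng.random() < t else tok for tok in tokens]

rng = random.Random(0)
seq = [5, 8, 3, 9, 4, 7]
print(forward_mask(seq, 0.0, rng))  # t=0: sequence unchanged
print(forward_mask(seq, 1.0, rng))  # t=1: every position becomes MASK_ID
```

Because the masking probability depends only on `t`, the denoiser never has to model a position- or token-dependent corruption schedule, which is the simplification the review refers to.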