academic

fairness transformer: Transformer architecture for discrete fair division problems
improving masked diffusion models: Alternative GenAI architectures using RL and confidence-based masking
memory layers for continual learning in LLMs: efficient continual learning through optimized memory layers

personal