Understanding the causes of non-factual hallucinations in language models and proposing effective detection methods.
LLMs exhibit biases impacting media bias detection, requiring debiasing strategies for more equitable AI systems.
LLMs' effective knowledge cutoffs often differ from their reported cutoff dates, owing to deduplication artifacts and the temporal misalignment of CommonCrawl dumps.
OpenEval introduces a comprehensive evaluation platform for Chinese LLMs, focusing on capability, alignment, and safety.
Training large language models to follow instructions can lead to safety vulnerabilities, but adding safety examples during fine-tuning can significantly improve model safety without compromising performance.
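The fix is largely a data-mixing step: blend a small share of safety demonstrations into the instruction-tuning corpus. A minimal sketch is below; the 3% ratio, the (prompt, response) pair format, and the function name are illustrative assumptions, not the paper's exact recipe.

```python
import random

def mix_in_safety_examples(instruction_data, safety_data, safety_fraction=0.03, seed=0):
    """Blend a small share of safety demonstrations into an instruction-tuning set.

    instruction_data and safety_data are lists of (prompt, response) pairs;
    safety_fraction is the target share of safety examples in the final mix.
    The ~3% default and the data format are assumptions for illustration.
    """
    rng = random.Random(seed)
    # Solve s / (N + s) = f for s, the number of safety examples to add.
    n_safety = int(len(instruction_data) * safety_fraction / (1.0 - safety_fraction))
    n_safety = min(n_safety, len(safety_data))
    mixed = instruction_data + rng.sample(safety_data, n_safety)
    rng.shuffle(mixed)
    return mixed
```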
Amortized Bayesian inference using GFlowNets enables efficient sampling from intractable posteriors in large language models.
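In GFlowNet fine-tuning, generation is treated as a sequential decision process and the policy is trained so that sequences are sampled in proportion to an unnormalized reward such as the posterior. A common objective is the trajectory-balance loss; the sketch below assumes left-to-right token generation (each prefix has a unique parent, so the backward-policy term drops out) and shows the generic GFlowNet objective rather than the paper's exact implementation.

```python
import torch

def trajectory_balance_loss(sum_log_pf: torch.Tensor,
                            log_reward: torch.Tensor,
                            log_z: torch.Tensor) -> torch.Tensor:
    """Trajectory-balance loss for one sampled sequence.

    sum_log_pf: sum of the policy's log-probabilities over generated tokens
    log_reward: log of the unnormalized target density (e.g., log-posterior)
    log_z:      learnable scalar estimating the log partition function

    For left-to-right generation each prefix has a unique parent, so the
    backward-policy term is zero and omitted. Batching and the exact reward
    definition are left out for brevity.
    """
    return (log_z + sum_log_pf - log_reward) ** 2

# Toy usage with made-up numbers:
log_z = torch.tensor(0.0, requires_grad=True)
loss = trajectory_balance_loss(torch.tensor(-12.3), torch.tensor(-10.7), log_z)
loss.backward()
```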
Speculative Contrastive Decoding (SCD) enhances decoding efficiency and quality in large language models by leveraging a smaller model both to draft tokens and to provide the contrastive signal.
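One way to picture this: the small model drafts a block of tokens and also supplies the "amateur" logits, the large model scores the same block in a single pass, and drafted tokens are kept only while they agree with the contrastive target. The sketch below is a simplified greedy variant under those assumptions; the alpha/beta values and the exact contrastive formulation are illustrative, and this is not the paper's accept/reject procedure.

```python
import torch
import torch.nn.functional as F

def contrastive_scores(expert_logits, amateur_logits, alpha=0.1, beta=0.5):
    """Contrastive-decoding style scores: expert log-probs minus scaled amateur
    log-probs, restricted to tokens the expert itself finds plausible.
    alpha (plausibility threshold) and beta (amateur weight) are assumptions."""
    expert_logp = F.log_softmax(expert_logits, dim=-1)
    amateur_logp = F.log_softmax(amateur_logits, dim=-1)
    plausible = expert_logp >= expert_logp.max(dim=-1, keepdim=True).values + torch.log(torch.tensor(alpha))
    scores = expert_logp - beta * amateur_logp
    return scores.masked_fill(~plausible, float("-inf"))

def verify_draft_greedy(draft_tokens, expert_logits, amateur_logits):
    """Keep drafted tokens while they match the argmax of the contrastive
    scores; replace the first mismatch and stop. A deterministic stand-in
    for speculative accept/reject."""
    targets = contrastive_scores(expert_logits, amateur_logits).argmax(dim=-1)
    accepted = []
    for drafted, target in zip(draft_tokens, targets.tolist()):
        if drafted != target:
            accepted.append(target)  # correct the first mismatching position
            break
        accepted.append(drafted)
    return accepted

# Toy usage: a 5-token draft over a 50-token vocabulary with random logits.
T, V = 5, 50
draft = torch.randint(V, (T,)).tolist()
accepted = verify_draft_greedy(draft, torch.randn(T, V), torch.randn(T, V))
```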
The authors introduce CoIN as a benchmark to evaluate Multimodal Large Language Models in sequential instruction tuning, highlighting the importance of aligning with human intent and retaining knowledge.
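Sequential instruction tuning means the model is fine-tuned on a stream of tasks one after another and should not forget earlier ones. A minimal evaluation loop under assumed `finetune`/`evaluate` callables and a hypothetical `Task` container is sketched below; it mirrors standard continual-learning evaluation and is not necessarily CoIN's exact protocol.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Task:
    name: str
    train_data: Any
    eval_data: Any

def sequential_tuning_eval(model, tasks: List[Task],
                           finetune: Callable, evaluate: Callable) -> List[List[float]]:
    """Fine-tune on each task in order; after each step, re-evaluate every task
    seen so far. results[i][j] is the score on task j after tuning through
    task i, so drops down a column indicate forgetting."""
    results = []
    for i, task in enumerate(tasks):
        model = finetune(model, task.train_data)
        results.append([evaluate(model, t.eval_data) for t in tasks[: i + 1]])
    return results
```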
The authors present DOMINO, a novel decoding algorithm that achieves efficient and minimally invasive constrained generation, outperforming existing approaches with no loss in accuracy.
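For context, the baseline that such methods accelerate is per-step constrained decoding: at every step the constraint engine (grammar, regex, or schema) yields the set of legal next tokens and everything else is masked out before sampling. The sketch below shows only that baseline idea with a hand-supplied allowed set; it is not DOMINO's algorithm itself, and the function name and interface are assumptions for illustration.

```python
import torch

def constrained_greedy_step(logits: torch.Tensor, allowed_token_ids) -> int:
    """One step of naive constrained greedy decoding: mask every token the
    constraint disallows to -inf, then take the best remaining token.
    In practice allowed_token_ids would come from a grammar or regex engine;
    here it is just an iterable of ints."""
    mask = torch.full_like(logits, float("-inf"))
    mask[torch.tensor(sorted(allowed_token_ids))] = 0.0
    return int(torch.argmax(logits + mask))

# Toy usage: a 10-token vocabulary where only tokens {2, 5, 7} are legal next.
next_token = constrained_greedy_step(torch.randn(10), {2, 5, 7})
```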