| τ-bench - A Benchmark for Tool-Agent-User Interaction in Real-World Domains |
| COIN |
| Compressed Chain of Thought - Efficient Reasoning Through Dense Representations |
| DeepSeek-R1 |
| Deliberative Alignment - Reasoning Enables Safer Language Models |
| G-Eval - NLG Evaluation using GPT-4 with Better Human Alignment |
| How To 100M Learning Text Video |
| How to Read a Paper |
| How To Write a Paper |
| How to Write Academic Paper (from CS Perspective) |
| Investigating Continual Pretraining in Large Language Models - Insights and Implications |
| Is a Question Decomposition Unit All We Need |
| Large Language Models are Zero-Shot Rankers for Recommender Systems |
| Molmo and PixMo |
| MultiVENT |
| OpenPI-C |
| Paper Template |
| Paper Template 1 |
| Piecing It All Together - Verifying Multi-Hop Multimodal Claims |
| Presentation Making Tips |
| PubMedQA - A Dataset for Biomedical Research Question Answering |
| Scientific Fact-Checking - A Survey of Resources and Approaches |
| Semantic Product Search for Matching Structured Product Catalogs in E-Commerce |
| Token Assorted - Mixing Latent and Text Tokens for Improved Language Model Reasoning |
| Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction |
| What is More Likely to Happen Next |
| Zotero Template |