At DAIR.AI we ❤️ reading ML papers so we've created this repo to highlight the top ML papers of every week.
Here is the weekly series:

## 2024
- [Top ML Papers of the Week (May 27 - June 2)](./#top-ml-papers-of-the-week-may-27---june-2---2024)
- [Top ML Papers of the Week (May 20 - May 26)](./#top-ml-papers-of-the-week-may-20---may-26---2024)
- [Top ML Papers of the Week (May 13 - May 19)](./#top-ml-papers-of-the-week-may-13---may-19---2024)
- [Top ML Papers of the Week (May 6 - May 12)](./#top-ml-papers-of-the-week-may-6---may-12---2024)

[Join our Discord](https://discord.gg/SKgkVT8BGJ)

## Top ML Papers of the Week (May 27 - June 2) - 2024
| **Paper** | **Links** |
| ------------- | ------------- |
| 1) **Contextual Position Encoding** - proposes a new position encoding method, CoPE, that conditions position on context by incrementing the position counter only on certain tokens; because the encoding is context-dependent, it can represent different levels of position abstraction, allowing attention to the i-th particular word, noun, or sentence; improves perplexity on language modeling and coding tasks (a minimal sketch of the gating idea appears after this table). | [Paper](https://arxiv.org/abs/2405.18719), [Tweet](https://x.com/jaseweston/status/1795978611784089799) |
| 2) **Symbolic Chain-of-Thought** - proposes a method that improves the logical reasoning capabilities of LLMs by integrating symbolic expressions and logical rules with chain-of-thought (CoT) prompting; the prompting technique is called Symbolic Chain-of-Thought and it’s a fully LLM-based framework with the following key steps: 1) translates natural language context to symbolic format, 2) derives a step-by-step plan to solve problems following symbolic logical rules, and 3) uses a verifier to check the translation and reasoning chain. | [Paper](https://arxiv.org/abs/2405.18357), [Tweet](https://x.com/omarsar0/status/1795925943543898157) |
| 3) **Abacus Embeddings** - achieves 99% accuracy on 100-digit addition problems after training on only 20-digit numbers with a single GPU; the main challenge this work addresses is the inability of transformers to track the exact position of digits; they do this by adding an embedding to each digit that encodes its position relative to the start of the number (see the positional-index sketch after this table); these gains also transfer to multi-step reasoning tasks that include sorting and multiplication. | [Paper](https://arxiv.org/abs/2405.17399), [Tweet](https://x.com/omarsar0/status/1795552696432202045) |
| 4) **Introduction to Vision-Language Modeling** - presents an introduction to vision-language models along with key details of how they work and how to effectively train these models. | [Paper](https://arxiv.org/abs/2405.17247), [Tweet](https://x.com/AIatMeta/status/1795499770519392499) |
| 5) **GNN-RAG** - combines the language understanding abilities of LLMs with the reasoning abilities of GNNs in a RAG style; the GNN extracts useful and relevant graph information while the LLM takes the information and leverages its capabilities to perform question answering over knowledge graphs (KGQA); GNN-RAG improves vanilla LLMs on KGQA and outperforms or matches GPT-4 performance with a 7B tuned LLM. | [Paper](https://arxiv.org/abs/2405.20139), [Tweet](https://x.com/omarsar0/status/1796578239105679585) |
| 6) **Attention as an RNN** - presents a new attention mechanism that can be trained in parallel (like Transformers) and updated efficiently with new tokens, requiring only constant memory for inference (like RNNs); the attention formulation is based on the parallel prefix scan algorithm, which enables efficient computation of attention’s many-to-many RNN output (a sequential-update sketch appears after this table); achieves comparable performance to Transformers on 38 datasets while being more time- and memory-efficient. | [Paper](https://arxiv.org/abs/2405.13956), [Tweet](https://x.com/iScienceLuvr/status/1793933723756286075) |
| 7) **Aya23** - a family of multilingual language models that can serve up to 23 languages; it intentionally focuses on fewer languages and allocates more capacity to them; shows that it can outperform other massively multilingual models on those specific languages. | [Paper](https://arxiv.org/abs/2405.15032), [Tweet](https://x.com/CohereForAI/status/1794044201677574446) |
| 8) **Are Long-LLMs A Necessity For Long-Context Tasks?** - claims that long-LLMs are not a necessity to solve long-context tasks; proposes a reasoning framework to enable short-LLMs to address long-context tasks by adaptively accessing and utilizing the context based on the presented tasks; it decomposes the long context into short contexts and processes them using a decision-making process. | [Paper](https://arxiv.org/abs/2405.15318), [Tweet](https://x.com/omarsar0/status/1795188655243264299) |
| 9) **Financial Statement Analysis with LLMs** - claims that LLMs can generate useful insights from their analysis of trends and financial ratios; shows that GPT-4 performs on par with narrowly specialized models; and demonstrates a profitable trading strategy based on GPT-4’s predictions. | [Paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4835311), [Tweet](https://x.com/omarsar0/status/1794120780428546503) |
| 10) **SimPO** - a simpler and more effective approach for preference optimization with a reference-free reward; uses the average log probability of a sequence as an implicit reward (i.e., no reference model required), which makes it more compute- and memory-efficient (a sketch of the loss appears after this table); demonstrates that it outperforms existing approaches like DPO and claims to produce the strongest 8B open-source model. | [Paper](https://arxiv.org/abs/2405.14734), [Tweet](https://x.com/rasbt/status/1794711330085036061) |
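
The CoPE idea (paper 1) is compact enough to sketch. Below is a minimal, self-contained illustration for a single attention head, based only on the paper's description: sigmoid gates decide which preceding tokens increment the position counter, fractional positions are handled by interpolating between neighboring position embeddings, and the result is an additive attention bias. Shapes, names, and the `pos_emb` table are our assumptions, not the authors' code.

```python
import torch

def cope_bias(q, k, pos_emb):
    """Contextual Position Encoding (CoPE) bias -- a minimal sketch.

    q, k:    (seq, dim) query/key vectors for one head
    pos_emb: (max_pos, dim) learnable position embeddings
    Returns a (seq, seq) additive attention bias.
    """
    seq, _ = q.shape
    gates = torch.sigmoid(q @ k.T)                      # g_ij in [0, 1]
    causal = torch.tril(torch.ones(seq, seq, dtype=torch.bool))
    gates = gates.masked_fill(~causal, 0.0)             # token i sees only j <= i
    # contextual position p_ij = sum of gates g_ik for k = j..i (reverse cumsum)
    pos = gates.flip(-1).cumsum(-1).flip(-1)
    pos = pos.clamp(max=pos_emb.size(0) - 1)
    # positions are fractional, so interpolate between adjacent embeddings
    lo, hi = pos.floor().long(), pos.ceil().long()
    w = (pos - lo.float()).unsqueeze(-1)
    e = (1.0 - w) * pos_emb[lo] + w * pos_emb[hi]       # (seq, seq, dim)
    return torch.einsum("id,ijd->ij", q, e)             # added to q @ k.T logits
```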
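
Abacus Embeddings (paper 3) reduce, at their core, to computing a per-digit positional index. The helper below is a rough sketch under our own assumptions (the paper also reverses digit order and tunes the offset range): each digit is indexed by its position within its own number, and a random offset at training time exposes the model to large indices even on short numbers.

```python
import random
import torch

def abacus_positions(tokens, digits=frozenset("0123456789"),
                     max_offset=100, train=True):
    """Abacus-style position indices -- a minimal sketch, not the authors' code.

    Digit tokens get their 1-based index within their own number;
    all other tokens get index 0 (no abacus position).
    """
    offset = random.randint(0, max_offset) if train else 0
    indices, run = [], 0
    for t in tokens:
        if t in digits:
            run += 1
            indices.append(run + offset)   # position relative to number start
        else:
            run = 0                        # a non-digit ends the current number
            indices.append(0)
    return torch.tensor(indices)

# Usage (hypothetical): look the indices up in an embedding table and
# add the result to the token embeddings before the first layer.
# abacus_emb = torch.nn.Embedding(max_offset + 128, d_model)
# h = tok_emb(ids) + abacus_emb(abacus_positions(tokens))
```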
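
For Attention as an RNN (paper 6), the paper's efficient training formulation rests on a parallel prefix scan; the sequential special case is the familiar streaming-softmax recurrence, which we sketch here only to show why inference needs constant memory. This is our own simplification, not the paper's Aaren module: a running numerator, denominator, and max suffice to keep softmax attention for a query up to date as tokens arrive.

```python
import torch

def attention_rnn_step(state, k_t, v_t, q):
    """One constant-memory attention update -- a sketch, not the paper's code.

    state = (num, den, m): running numerator (dim,), denominator and
    max score (scalars) of the softmax over all keys seen so far.
    """
    num, den, m = state
    s = q @ k_t                            # score of the newly arrived token
    m_new = torch.maximum(m, s)            # running max for numerical stability
    scale = torch.exp(m - m_new)           # rescale old sums to the new max
    w = torch.exp(s - m_new)
    num = num * scale + w * v_t
    den = den * scale + w
    return (num, den, m_new), num / den    # new state, current attention output

# Usage (hypothetical): dim = 64
# state = (torch.zeros(dim), torch.tensor(0.0), torch.tensor(float("-inf")))
# for k_t, v_t in zip(keys, values):
#     state, out = attention_rnn_step(state, k_t, v_t, q)
```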
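
The SimPO objective (paper 10) itself fits in a few lines. Below is a sketch of the loss as stated in the paper: the implicit reward is the length-normalized average log probability of a response under the policy (so no reference model is needed), and gamma enforces a target margin between chosen and rejected responses; the beta and gamma defaults are illustrative, not the tuned values.

```python
import torch
import torch.nn.functional as F

def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=1.0):
    """SimPO loss -- a minimal sketch of the paper's objective.

    logp_*: summed token log-probs of each response under the policy
    len_*:  response lengths in tokens
    """
    # reference-free implicit rewards: length-normalized log probability
    r_chosen = beta * logp_chosen / len_chosen
    r_rejected = beta * logp_rejected / len_rejected
    # Bradley-Terry objective with a target reward margin gamma
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()
```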

## Top ML Papers of the Week (May 20 - May 26) - 2024
| **Paper** | **Links** |
| ------------- | ------------- |
