At DAIR.AI we ❤️ reading ML papers so we've created this repo to highlight the top ML papers of every week.
Here is the weekly series:

## 2024
- [Top ML Papers of the Week (May 27 - June 2)](./#top-ml-papers-of-the-week-may-27---june-2---2024)
- [Top ML Papers of the Week (May 20 - May 26)](./#top-ml-papers-of-the-week-may-20---may-26---2024)
- [Top ML Papers of the Week (May 13 - May 19)](./#top-ml-papers-of-the-week-may-13---may-19---2024)
- [Top ML Papers of the Week (May 6 - May 12)](./#top-ml-papers-of-the-week-may-6---may-12---2024)

[Join our Discord](https://discord.gg/SKgkVT8BGJ)

## Top ML Papers of the Week (May 27 - June 2) - 2024
| **Paper** | **Links** |
| ------------- | ------------- |
| 1) **Contextual Position Encoding** - proposes a new position encoding method, CoPE, that conditions position on context by incrementing the position counter only on certain tokens; because the encoding is context-dependent, it can represent different levels of position abstraction, allowing attention to the i-th particular word, noun, or sentence; improves perplexity on language modeling and coding tasks (a minimal sketch of the gating idea appears after this table). | [Paper](https://arxiv.org/abs/2405.18719), [Tweet](https://x.com/jaseweston/status/1795978611784089799) |
| 2) **Symbolic Chain-of-Thought** - proposes a method that improves the logical reasoning capabilities of LLMs by integrating symbolic expressions and logical rules with chain-of-thought (CoT) prompting; the prompting technique is called Symbolic Chain-of-Thought and it’s a fully LLM-based framework with the following key steps: 1) translates natural language context to symbolic format, 2) derives a step-by-step plan to solve problems following symbolic logical rules, and 3) uses a verifier to check the translation and reasoning chain. | [Paper](https://arxiv.org/abs/2405.18357), [Tweet](https://x.com/omarsar0/status/1795925943543898157) |
| 3) **Abacus Embeddings** - achieves 99% accuracy on 100-digit addition problems after training on only 20-digit numbers with a single GPU; the main challenge this work addresses is the inability of transformers to track the exact position of digits; they do this by adding an embedding to each digit that encodes its position relative to the start of the number (see the positional-index sketch after this table); these gains also transfer to multi-step reasoning tasks that include sorting and multiplication. | [Paper](https://arxiv.org/abs/2405.17399), [Tweet](https://x.com/omarsar0/status/1795552696432202045) |
| 4) **Introduction to Vision-Language Modeling** - presents an introduction to vision-language models along with key details of how they work and how to effectively train these models. | [Paper](https://arxiv.org/abs/2405.17247), [Tweet](https://x.com/AIatMeta/status/1795499770519392499) |
| 5) **GNN-RAG** - combines the language understanding abilities of LLMs with the reasoning abilities of GNNs in a RAG style; the GNN extracts useful and relevant graph information while the LLM takes the information and leverages its capabilities to perform question answering over knowledge graphs (KGQA); GNN-RAG improves vanilla LLMs on KGQA and outperforms or matches GPT-4 performance with a 7B tuned LLM. | [Paper](https://arxiv.org/abs/2405.20139), [Tweet](https://x.com/omarsar0/status/1796578239105679585) |
| 6) **Attention as an RNN** - presents a new attention mechanism that can be trained in parallel (like Transformers) and updated efficiently with new tokens, requiring only constant memory for inference (like RNNs); the attention formulation is based on the parallel prefix scan algorithm, which enables efficient computation of attention’s many-to-many RNN output (a sequential-update sketch appears after this table); achieves comparable performance to Transformers on 38 datasets while being more time- and memory-efficient. | [Paper](https://arxiv.org/abs/2405.13956), [Tweet](https://x.com/iScienceLuvr/status/1793933723756286075) |
| 7) **Aya23** - a family of multilingual language models that can serve up to 23 languages; it intentionally focuses on fewer languages and allocates more capacity to them; shows that it can outperform other massively multilingual models on those specific languages. | [Paper](https://arxiv.org/abs/2405.15032), [Tweet](https://x.com/CohereForAI/status/1794044201677574446) |
| 8) **Are Long-LLMs A Necessity For Long-Context Tasks?** - claims that long-LLMs are not a necessity to solve long-context tasks; proposes a reasoning framework to enable short-LLMs to address long-context tasks by adaptively accessing and utilizing the context based on the presented tasks; it decomposes the long context into short contexts and processes them using a decision-making process. | [Paper](https://arxiv.org/abs/2405.15318), [Tweet](https://x.com/omarsar0/status/1795188655243264299) |
| 9) **Financial Statement Analysis with LLMs** - claims that LLMs can generate useful insights from their analysis of trends and financial ratios; shows that GPT-4 performs on par with narrowly specialized models; and demonstrates a profitable trading strategy based on GPT-4’s predictions. | [Paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4835311), [Tweet](https://x.com/omarsar0/status/1794120780428546503) |
| 10) **SimPO** - a simpler and more effective approach for preference optimization with a reference-free reward; uses the average log probability of a sequence as an implicit reward (i.e., no reference model required), which makes it more compute- and memory-efficient (a sketch of the loss appears after this table); demonstrates that it outperforms existing approaches like DPO and claims to produce the strongest 8B open-source model. | [Paper](https://arxiv.org/abs/2405.14734), [Tweet](https://x.com/rasbt/status/1794711330085036061) |
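
The CoPE idea (paper 1) is compact enough to sketch. Below is a minimal, self-contained illustration for a single attention head, based only on the paper's description: sigmoid gates decide which preceding tokens increment the position counter, fractional positions are handled by interpolating between neighboring position embeddings, and the result is an additive attention bias. Shapes, names, and the `pos_emb` table are our assumptions, not the authors' code.

```python
import torch

def cope_bias(q, k, pos_emb):
    """Contextual Position Encoding (CoPE) bias -- a minimal sketch.

    q, k:    (seq, dim) query/key vectors for one head
    pos_emb: (max_pos, dim) learnable position embeddings
    Returns a (seq, seq) additive attention bias.
    """
    seq, _ = q.shape
    gates = torch.sigmoid(q @ k.T)                      # g_ij in [0, 1]
    causal = torch.tril(torch.ones(seq, seq, dtype=torch.bool))
    gates = gates.masked_fill(~causal, 0.0)             # token i sees only j <= i
    # contextual position p_ij = sum of gates g_ik for k = j..i (reverse cumsum)
    pos = gates.flip(-1).cumsum(-1).flip(-1)
    pos = pos.clamp(max=pos_emb.size(0) - 1)
    # positions are fractional, so interpolate between adjacent embeddings
    lo, hi = pos.floor().long(), pos.ceil().long()
    w = (pos - lo.float()).unsqueeze(-1)
    e = (1.0 - w) * pos_emb[lo] + w * pos_emb[hi]       # (seq, seq, dim)
    return torch.einsum("id,ijd->ij", q, e)             # added to q @ k.T logits
```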
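
Abacus Embeddings (paper 3) reduce, at their core, to computing a per-digit positional index. The helper below is a rough sketch under our own assumptions (the paper also reverses digit order and tunes the offset range): each digit is indexed by its position within its own number, and a random offset at training time exposes the model to large indices even on short numbers.

```python
import random
import torch

def abacus_positions(tokens, digits=frozenset("0123456789"),
                     max_offset=100, train=True):
    """Abacus-style position indices -- a minimal sketch, not the authors' code.

    Digit tokens get their 1-based index within their own number;
    all other tokens get index 0 (no abacus position).
    """
    offset = random.randint(0, max_offset) if train else 0
    indices, run = [], 0
    for t in tokens:
        if t in digits:
            run += 1
            indices.append(run + offset)   # position relative to number start
        else:
            run = 0                        # a non-digit ends the current number
            indices.append(0)
    return torch.tensor(indices)

# Usage (hypothetical): look the indices up in an embedding table and
# add the result to the token embeddings before the first layer.
# abacus_emb = torch.nn.Embedding(max_offset + 128, d_model)
# h = tok_emb(ids) + abacus_emb(abacus_positions(tokens))
```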
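
For Attention as an RNN (paper 6), the paper's efficient training formulation rests on a parallel prefix scan; the sequential special case is the familiar streaming-softmax recurrence, which we sketch here only to show why inference needs constant memory. This is our own simplification, not the paper's Aaren module: a running numerator, denominator, and max suffice to keep softmax attention for a query up to date as tokens arrive.

```python
import torch

def attention_rnn_step(state, k_t, v_t, q):
    """One constant-memory attention update -- a sketch, not the paper's code.

    state = (num, den, m): running numerator (dim,), denominator and
    max score (scalars) of the softmax over all keys seen so far.
    """
    num, den, m = state
    s = q @ k_t                            # score of the newly arrived token
    m_new = torch.maximum(m, s)            # running max for numerical stability
    scale = torch.exp(m - m_new)           # rescale old sums to the new max
    w = torch.exp(s - m_new)
    num = num * scale + w * v_t
    den = den * scale + w
    return (num, den, m_new), num / den    # new state, current attention output

# Usage (hypothetical): dim = 64
# state = (torch.zeros(dim), torch.tensor(0.0), torch.tensor(float("-inf")))
# for k_t, v_t in zip(keys, values):
#     state, out = attention_rnn_step(state, k_t, v_t, q)
```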
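
The SimPO objective (paper 10) itself fits in a few lines. Below is a sketch of the loss as stated in the paper: the implicit reward is the length-normalized average log probability of a response under the policy (so no reference model is needed), and gamma enforces a target margin between chosen and rejected responses; the beta and gamma defaults are illustrative, not the tuned values.

```python
import torch
import torch.nn.functional as F

def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=1.0):
    """SimPO loss -- a minimal sketch of the paper's objective.

    logp_*: summed token log-probs of each response under the policy
    len_*:  response lengths in tokens
    """
    # reference-free implicit rewards: length-normalized log probability
    r_chosen = beta * logp_chosen / len_chosen
    r_rejected = beta * logp_rejected / len_rejected
    # Bradley-Terry objective with a target reward margin gamma
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()
```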

## Top ML Papers of the Week (May 20 - May 26) - 2024
| **Paper** | **Links** |
| ------------- | ------------- |
