
streaming-llm/README.md at main · mit-han-lab/streaming-llm
Deploying Large Language Models (LLMs) in streaming applications such as multi-round dialogue, where long interactions are expected, is urgently needed but poses two major …
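The repo's core idea (per its ICLR 2024 tagline, "Efficient Streaming Language Models with Attention Sinks") is to keep a few initial "attention sink" tokens plus a sliding window of recent tokens in the KV cache, evicting everything in between. A minimal sketch of that eviction policy, with illustrative class and parameter names (not the repo's actual API):

```python
# Sketch of the attention-sink cache policy: retain the first few "sink"
# entries forever, plus a sliding window of the most recent entries.
# `tokens` stands in for cached key/value pairs; names are illustrative.

class SinkCache:
    def __init__(self, start_size=4, recent_size=8):
        self.start_size = start_size    # attention-sink tokens, never evicted
        self.recent_size = recent_size  # sliding window of recent tokens
        self.tokens = []

    def append(self, token):
        self.tokens.append(token)
        self._evict()

    def _evict(self):
        max_len = self.start_size + self.recent_size
        if len(self.tokens) > max_len:
            # Drop the middle: keep sinks + the most recent window.
            self.tokens = (self.tokens[:self.start_size]
                           + self.tokens[-self.recent_size:])

cache = SinkCache(start_size=4, recent_size=8)
for t in range(100):
    cache.append(t)
print(cache.tokens)  # [0, 1, 2, 3, 92, 93, 94, 95, 96, 97, 98, 99]
```

With this policy the cache size stays bounded regardless of stream length, which is what makes indefinitely long multi-round dialogue feasible.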
How do you feed long texts to a model? #2 - GitHub
Oct 2, 2023 · I tried naively adding examples to https://github.com/mit-han-lab/streaming-llm/blob/main/data/mt_bench.jsonl, including examples 4k tokens long, without …
Is there a way to do parallel prompting? · Issue #69 · mit-han-lab ...
Nov 20, 2023 · As run_streaming_llama.py#L61 (https://github.com/mit-han-lab/streaming-llm/blob/main/examples/run_streaming_llama.py#L61) shows, prompts must be sent to the model one …
streaming-llm/streaming_llm/enable_streaming_llm.py at main - GitHub
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks - mit-han-lab/streaming-llm
Can support to codellama34b? · Issue #35 · mit-han-lab ... - GitHub
Oct 10, 2023 · Guangxuan-Xiao commented on Oct 11, 2023: CodeLlamas are also Llama models, so we have already supported them :). https://huggingface.co/codellama/CodeLlama …
streaming-llm/streaming_llm/utils.py at main · mit-han-lab ... - GitHub
streaming-llm/data/mt_bench.jsonl at main - GitHub
Code for a PyTorch layer? · Issue #91 · mit-han-lab/streaming-llm
https://github.com/mit-han-lab/streaming-llm/blob/main/streaming_llm/pos_shift/modify_llama.py — but if that work has already been done somewhere else, that'd be a great time saver
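The pos_shift code that issue #91 refers to changes how rotary position ids are assigned: positions are computed from a token's slot within the cache rather than its absolute position in the stream, so position ids stay bounded after eviction. A hedged sketch of that idea, with an illustrative helper name (not a function from the repo):

```python
# Sketch of the position-shift idea behind pos_shift/modify_llama.py:
# assign rotary position ids by position *within the cache*, not by the
# token's original index in the stream. Helper name is illustrative.

def shifted_position_ids(cache_len, num_new_tokens):
    """Position ids for `num_new_tokens` tokens appended to a cache
    currently holding `cache_len` entries."""
    return list(range(cache_len, cache_len + num_new_tokens))

# After eviction the cache might hold 12 entries (e.g. 4 sinks + 8 recent),
# so the next token gets position 12 even if it is the 100th in the stream.
print(shifted_position_ids(12, 1))  # [12]
print(shifted_position_ids(0, 3))   # [0, 1, 2]
```

Because positions never exceed the cache size, the model is never asked to extrapolate to position indices it did not see during training.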