SSR Prompting: The Best Alternate to RAG on Single Large Documents

This article presents the development and findings of the SSR (Split, Summarize, Rerank) prompting technique developed for YouTubeSummarizer, a free tool developed by the team.

Note: YouTubeSummarizer is going live on 27-Oct-2023.


We are happy to introduce the SSR prompting technique designed for developing YouTubeSummarizer. YouTubeSummarizer aims to provide users with the top 10 key insights from a YouTube video based on its transcription.

Our prompting recommendation – SSR (Split-Summarize-Rerank), has shown promising results comparable to running a full-blown RAG (Retrieval Augmented Generation) on a single large document. SSR prompting technique helps you to keep the document summarization process lightweight and does not need a vector database or a library.

With the proliferation of video content on platforms like YouTube, there’s a rising need for tools to provide rapid content insights. The YouTube Summarizer, a product, aims to encapsulate the core content of YouTube videos in ten bullet points, facilitating users to grasp the essence of videos quickly. This helps the user to quickly get all key insights from the video (which may vary between 30 minutes to 3 hours) and then decide to listen/view the whole video, thus saving countless hours for the user.

The conventional approach to summarizing extensive transcriptions is the RAG (Retrieval Augmented Generation) technique. Though effective, its dependence on vector databases and heavy processing can be a bottleneck for real-time applications.


SSR Prompting: The Best Alternative to RAG on Single Large Documents

1. Transcription & Chunking: Transcribe content into text and segment into chunks. Each chunk is restricted to slightly above 3,000 words to fit within the 4k word limit of smaller LLMs.

2. Dual Pathway for Insights:

a. Direct Insights from Chunks:

• Extract about ten insights per chunk.

• Aggregate all insights generated from the chunks.

• Re-rank the set based on the insight’s usefulness, relativeness, and impact.

• Select the top 10 insights for the final bucket.

b. Summarized Insights from Chunks:

• Summarize each chunk, reducing word count by 80%.

• Merge the summaries.

• Derive ten key insights from the merged summary.

• Re-rank and select the top 10 for the final bucket.

3. Merging and Final Re-Ranking: Combine insights from both paths and perform a final re-ranking to select the top 10 insights.


Using the SSR prompting technique, we achieved results comparable to those using RAG but with improved efficiency. The methodology was tested on GPT-3.5 and GPT-4, with only marginal improvements observed with GPT-4.


• Chunk Size: 3000 words

• Model: GPT-3.5 and GPT-4

• Temperature: 0.6

• Top P: 0.8

• Frequency Penalty: 0.2

• Presence Penalty: 0.3

Try various combinations of chunk sizes, temp, top_p, and other parameters for your needs.


The SSR technique emerges as a viable alternative to RAG for applications requiring quick and lightweight solutions without compromising the quality of insights.

The YouTube Summarizer, equipped with the SSR technique, demonstrates the potential of innovative prompting techniques. It is a quicker alternative to the RAG model, thus allowing a simpler one-shot insights extraction from single large documents.

YouTubeSummarizer is going live on 27-Oct-2023.

What’s your Reaction?

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *