# Chapter 7: Tutorial Content Generation
Welcome back! We've come a long way. We've seen the Web Interface (Frontend) where you start the process, how the backend is set up using Serverless Deployment (Azure Functions), and how a Workflow Engine (Pocket Flow) orchestrates all the steps. We've also covered the initial steps of getting the raw code via Code Fetching and using AI Understanding to analyze it and figure out the main concepts (Abstractions) and how they relate. Plus, you know how our project talks to the AI Brain using LLM Communication.
Now, we have all the ingredients:

- The raw code content.
- A list of the main concepts in the code (Abstractions).
- An understanding of how those concepts connect (Relationships).
- A planned order for the tutorial chapters.
But we don't have the actual tutorial text! We need to turn these ingredients into readable, beginner-friendly explanations, code examples, and diagrams that someone new to the codebase can understand.
This is the core task of Tutorial Content Generation.
## What is Tutorial Content Generation?
Tutorial Content Generation is the step where the project writes the actual chapters of the tutorial. It takes the structured information gathered in the previous steps (especially the identified abstractions, their descriptions, relevant code snippets, and the determined chapter order) and uses the AI to craft the narrative.
Think of it like writing a book based on a detailed outline and research notes. You have the structure (chapter order) and the topics/facts for each chapter (abstractions, relationships, code). Now, you need to write the prose, explaining things clearly, providing examples, and making sure the chapters flow logically from one to the next.
## The Problem: Writing a Tutorial is Hard!
Even with a clear understanding of the codebase's concepts and structure, writing a good tutorial is a skill. You need to:
- Explain complex technical ideas in simple terms.
- Use analogies that make sense to beginners.
- Select and present relevant code snippets clearly.
- Break down complex processes into step-by-step instructions.
- Ensure a smooth flow between different topics and chapters.
- Add diagrams and visualizations where helpful.
Doing this manually for an entire codebase is a massive effort. It requires deep technical understanding and excellent communication skills, specifically tailored for beginners.
## The Solution: AI as Your Tutorial Author
Our project leverages the power of the AI (the LLM) once again, this time to act as the tutorial author. By giving the AI the structured analysis results and very specific instructions (via a detailed prompt), we can automate the process of writing each chapter.
This is handled primarily by one specific Pocket Flow Node: the `WriteChapters` node.

## How It Works: The `WriteChapters` Node

In the Pocket Flow workflow (Chapter 3), the `WriteChapters` node sits after the analysis and ordering steps. Its job is to go through the planned chapter order and write the content for each chapter.
Let's look at the process orchestrated by this node:
1. **Batch Processing (`BatchNode`):** `WriteChapters` is implemented as a `BatchNode`. This is useful because the task of writing one chapter is largely independent of writing another (though we do provide context about previous chapters, as we'll see). A `BatchNode` prepares a list of items to process, and then the `exec` method is called for each item in that list. In this case, each item represents the job of writing one tutorial chapter.

2. **`prep` Phase: Prepare the Chapters:**
   - The `prep` method of `WriteChapters` reads the `chapter_order` (the list of abstraction indices determined in `OrderChapters`), the `abstractions` list (with names, descriptions, and file indices from `IdentifyAbstractions`), and the original `files` data from the `shared` store.
   - It then creates a list of data objects, one for each planned chapter. Each object contains all the information the AI will need to write that specific chapter. This includes:
     - The chapter number.
     - The details (name, description, relevant file indices) of the abstraction that chapter is about.
     - The content of the relevant code files identified for this abstraction.
     - The full list of all chapters with their planned names and filenames (this is crucial for the AI to create links between chapters).
     - Information about the previous and next chapters in the sequence (to help the AI write transitions).
     - The project name and language setting.
   - This list of chapter data objects is what the `BatchNode` will process, calling `exec` for each one.

   ```mermaid
   sequenceDiagram
       participant Shared as Shared Store
       participant WriteNode as WriteChapters Node
       WriteNode->>Shared: Read "chapter_order"<br>"abstractions"<br>"files"<br>"project_name"<br>"language"
       WriteNode->>WriteNode: prep(shared)
       Note over WriteNode: Creates list of chapter job items<br>(1 item per planned chapter)<br>Each item has context for ONE chapter<br>(Abstraction, Code, Full Chapter List, etc.)
       WriteNode-->>WriteNode: Return list of chapter items
   ```

   The `prep` method gathers inputs and prepares a list of tasks, one for each chapter.
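Concretely, one of the "chapter job" items prepared by `prep` might look like the sketch below. The field names follow the `WriteChapters` code shown later in this chapter; the values are toy data we made up for illustration.

```python
# A sketch of ONE chapter job item produced by prep().
# Field names match the real code; the values here are hypothetical.
chapter_item = {
    "chapter_num": 1,
    "abstraction_index": 0,
    "abstraction_details": {
        "name": "Web Interface (Frontend)",
        "description": "The browser UI where users start the process.",
        "files": [0],
    },
    # Maps "index # path" keys to file content (built by get_content_for_indices)
    "related_files_content_map": {"0 # frontend/index.html": "<html>...</html>"},
    "project_name": "Tutorial-Codebase-Knowledge",
    "full_chapter_listing": "1. [Web Interface (Frontend)](01_web_interface__frontend_.md)",
    "chapter_filenames": {
        0: {"num": 1, "name": "Web Interface (Frontend)",
            "filename": "01_web_interface__frontend_.md"},
    },
    "language": "english",
}
```

Each item is self-contained: `exec` needs nothing beyond this dictionary (plus the running summary of previous chapters) to write its chapter.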
3. **`exec` Phase: Write One Chapter:**
   - The `exec` method is called by the `BatchNode` framework for each chapter data object prepared by `prep`. It receives the data for one specific chapter.
   - Inside `exec`, the code performs the most critical task: it builds a very detailed prompt for the AI. This prompt is like a set of instructions and context specifically for writing that single chapter.
   - The prompt includes:
     - The chapter's number, the project name, and the specific concept (Abstraction name and description) it should explain.
     - The content of the relevant code snippets for this concept.
     - The complete list of all planned chapters including their calculated filenames (e.g., "1. Web Interface (Frontend)"). This is essential so the AI knows the titles and filenames when instructed to create links to other chapters.
     - A summary of the content generated for the previous chapters in this batch run. This is stored in a temporary instance variable (`self.chapters_written_so_far`) that the `BatchNode` instance keeps track of. Including this allows the AI to build upon what was just explained, creating a more coherent tutorial flow and writing proper transitions.
     - Detailed instructions on how to write the chapter:
       - Use a beginner-friendly tone.
       - Start with a high-level problem/use case.
       - Break down complex ideas.
       - Explain how to use the concept.
       - Provide simple, short code examples (under 20 lines!) and explain them.
       - Describe the internal implementation, perhaps with a simple sequence diagram.
       - Crucially, use Markdown links to reference other chapters, looking up the correct title and filename from the provided full chapter list.
       - Use analogies and Mermaid diagrams.
       - Write a conclusion and transition to the next chapter, again using a Markdown link if applicable.
     - A specification that the output should be only the Markdown text for that chapter.
     - Language instructions if the target language is not English.
   - The `exec` method then calls the `call_llm` utility (Chapter 6) with this carefully crafted prompt.
   - It receives the raw Markdown text response from the AI.
   - It performs basic validation or cleanup (like ensuring the chapter starts with the correct Markdown heading).
   - It adds the generated Markdown content for this chapter to the temporary list (`self.chapters_written_so_far`). This content will then be available as "Context from previous chapters" when the `exec` method runs for the next chapter in the batch.
   - It returns the generated Markdown content for this specific chapter.

   ```mermaid
   sequenceDiagram
       participant WriteNode as WriteChapters Node<br>(exec method, running for 1 chapter)
       participant LLM_Util as call_llm Utility
       participant LLM as Large Language Model (AI)
       WriteNode->>WriteNode: Get current chapter's item data<br>Get self.chapters_written_so_far<br>Build detailed prompt<br>(Concept, Code, Full Structure, Previous Summary, Instructions)
       WriteNode->>LLM_Util: call_llm(prompt)
       LLM_Util->>LLM: Send prompt for ONE chapter
       LLM-->>LLM_Util: Return raw Markdown text
       LLM_Util-->>WriteNode: Return text
       WriteNode->>WriteNode: Validate/Cleanup text<br>Add text to self.chapters_written_so_far
       WriteNode-->>WriteNode: Return generated chapter text
   ```

   The `exec` method runs for each chapter item, builds a detailed prompt including context from previous chapters, calls the AI, gets the content, and updates the context for the next chapter.
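The validation step mentioned above can be sketched as a small helper: guarantee that the returned Markdown starts with the expected chapter heading, prepending it if the AI left it out. The function name here is ours, not the project's; the heading format matches the one used in the real code.

```python
def ensure_heading(chapter_num: int, concept_name: str, content: str) -> str:
    """Sketch of the post-LLM cleanup: make sure the chapter begins with
    the expected '# Chapter N: Name' Markdown heading."""
    heading = f"# Chapter {chapter_num}: {concept_name}"
    if not content.strip().startswith(heading):
        # Heading missing or wrong: prepend the correct one
        content = f"{heading}\n\n{content}"
    return content
```

This keeps the generated files uniform even when the model occasionally skips or reworded the heading.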
4. **`post` Phase: Collect All Chapters:**
   - After the `exec` method has successfully run for all the chapter items prepared by `prep`, the `post` method of `WriteChapters` is called.
   - The `BatchNode` framework automatically collects all the return values from every `exec` call and passes them to `post` as a list (`exec_res_list`). This list contains the generated Markdown content for every single chapter, in the order they were processed.
   - The `post` method takes this list of chapter contents and stores it in the `shared` dictionary under the key `"chapters"`.
   - It also cleans up the temporary `self.chapters_written_so_far` list.

   ```mermaid
   sequenceDiagram
       participant WriteNode as WriteChapters Node
       participant Shared as Shared Store
       WriteNode->>WriteNode: post(shared, ..., exec_res_list)<br>exec_res_list contains ALL chapter contents
       WriteNode->>Shared: Write "chapters"<br>(List of all chapter Markdown contents)
       WriteNode->>WriteNode: Clean up self.chapters_written_so_far
   ```

   The `post` method is called once after all chapters are written, collecting the results and storing them in the shared store.
By using a `BatchNode` and carefully managing the context passed to the AI (the full structure for linking and the cumulative summary of previous chapters), the `WriteChapters` node efficiently generates the complete tutorial content, chapter by chapter.
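The `prep`/`exec`/`post` lifecycle described above can be sketched with a toy stand-in for the framework. This is not Pocket Flow's real code, just a minimal illustration of the contract: `prep` returns a list of items, `exec` runs once per item, and `post` receives every result as a list.

```python
# A toy sketch of the BatchNode contract (not Pocket Flow's implementation).
class ToyBatchNode:
    def prep(self, shared):
        # One job item per planned chapter
        return [{"chapter_num": i + 1} for i in range(len(shared["chapter_order"]))]

    def exec(self, item):
        # Stand-in for the real "build prompt, call LLM" work
        return f"# Chapter {item['chapter_num']}"

    def post(self, shared, prep_res, exec_res_list):
        # All exec results arrive here as one ordered list
        shared["chapters"] = exec_res_list

def run_batch(node, shared):
    items = node.prep(shared)
    results = [node.exec(item) for item in items]  # sequential, as in our use
    node.post(shared, items, results)

shared = {"chapter_order": [2, 0, 1]}
run_batch(ToyBatchNode(), shared)
# shared["chapters"] now holds one entry per planned chapter, in order
```

Running the items sequentially (rather than in parallel) is what makes the cumulative `chapters_written_so_far` context possible.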
## Looking at the Code (`function_app/nodes.py`)

Let's look at simplified snippets from the `WriteChapters` class in `function_app/nodes.py` to see these steps reflected in code.
```python
# function_app/nodes.py (Simplified WriteChapters)
from pocketflow import BatchNode  # Note: It's a BatchNode!
from utils.call_llm import call_llm  # We use the LLM communication utility

class WriteChapters(BatchNode):
    def prep(self, shared):
        chapter_order = shared["chapter_order"]  # List of indices for chapter order
        abstractions = shared["abstractions"]    # List of identified concepts
        files_data = shared["files"]             # Original file content
        language = shared.get("language", "english")  # Language setting
        self.chapters_written_so_far = []  # Temporary storage for previous chapters summary

        # --- Prepare full chapter list and filenames for linking ---
        all_chapters = []
        chapter_filenames = {}  # Map abstraction index to filename
        for i, abstraction_index in enumerate(chapter_order):
            if 0 <= abstraction_index < len(abstractions):
                chapter_num = i + 1
                chapter_name = abstractions[abstraction_index]["name"]  # Get concept name
                # Create filename from concept name (e.g., "01_web_interface.md")
                safe_name = "".join(c if c.isalnum() else '_' for c in chapter_name).lower()
                filename = f"{i+1:02d}_{safe_name}.md"
                # Store for linking: "[Chapter Title](filename.md)"
                all_chapters.append(f"{chapter_num}. [{chapter_name}]({filename})")
                # Store mapping for easy lookup in exec
                chapter_filenames[abstraction_index] = {
                    "num": chapter_num, "name": chapter_name, "filename": filename}
        full_chapter_listing = "\n".join(all_chapters)  # This goes into the prompt

        # --- Create list of items for the batch process (1 item per chapter) ---
        items_to_process = []
        for i, abstraction_index in enumerate(chapter_order):
            if 0 <= abstraction_index < len(abstractions):
                abstraction_details = abstractions[abstraction_index]  # Concept details
                related_file_indices = abstraction_details.get("files", [])  # Relevant files
                # Get content for relevant files using helper function
                related_files_content_map = get_content_for_indices(files_data, related_file_indices)
                # Prepare data for this specific chapter's item
                items_to_process.append({
                    "chapter_num": i + 1,
                    "abstraction_index": abstraction_index,
                    "abstraction_details": abstraction_details,
                    "related_files_content_map": related_files_content_map,
                    "project_name": shared["project_name"],
                    "full_chapter_listing": full_chapter_listing,  # Needed for linking
                    "chapter_filenames": chapter_filenames,        # Needed for linking
                    "language": language,
                    # Note: 'previous_chapters_summary' is NOT added here; it's built in exec
                })
            # else: handle invalid index... (omitted for brevity)
        print(f"Preparing to write {len(items_to_process)} chapters...")
        return items_to_process  # Return the list of items

    def exec(self, item):
        # This runs for EACH item (chapter)
        abstraction_name = item["abstraction_details"]["name"]
        abstraction_description = item["abstraction_details"]["description"]
        chapter_num = item["chapter_num"]
        project_name = item.get("project_name")
        language = item.get("language", "english")
        print(f"Writing chapter {chapter_num} for: {abstraction_name} using LLM...")

        # Get the content generated for previous chapters (stored on the instance).
        # This provides context for writing transitions.
        previous_chapters_summary = "\n---\n".join(self.chapters_written_so_far)

        # Format relevant code snippets for the prompt
        file_context_str = "\n\n".join(
            f"--- File: {idx_path.split('# ')[1] if '# ' in idx_path else idx_path} ---\n{content}"
            for idx_path, content in item["related_files_content_map"].items()
        )

        # --- Build the detailed prompt for the AI (Simplified) ---
        # The real prompt is long. It includes: language instructions (if not
        # English), the chapter heading structure, the concept details (name,
        # description), the FULL chapter listing for linking, the summary of
        # previously written chapters, the relevant code snippets, and detailed
        # instructions on content, tone, code limits, diagrams, cross-linking, etc.
        prompt = f"""
Write a very beginner-friendly tutorial chapter (in Markdown format) for the project `{project_name}` about the concept: "{abstraction_name}". This is Chapter {chapter_num}.

... (Rest of the detailed instructions and context like the actual prompt in the file) ...

Context from previous chapters:
{previous_chapters_summary if previous_chapters_summary else "This is the first chapter."}

Relevant Code Snippets (Code itself remains unchanged):
{file_context_str if file_context_str else "No specific code snippets provided for this abstraction."}

... (More instructions on format, linking, diagrams, etc.) ...

Output *only* the Markdown content for this chapter.
"""  # A highly simplified representation of the full prompt string
        chapter_content = call_llm(prompt)  # Call the AI!

        # Basic validation/cleanup (like checking/adding the heading)
        actual_heading = f"# Chapter {chapter_num}: {abstraction_name}"
        if not chapter_content.strip().startswith(actual_heading):
            # Add heading if missing or incorrect... (simplified)
            chapter_content = f"{actual_heading}\n\n{chapter_content}"

        # Add the generated content to the temporary list for the NEXT chapter's context
        self.chapters_written_so_far.append(chapter_content)
        return chapter_content  # Return the content for THIS chapter

    def post(self, shared, prep_res, exec_res_list):
        # This runs AFTER all exec calls are complete.
        # exec_res_list holds the return value of each exec call (the chapter content).
        shared["chapters"] = exec_res_list  # Store the final list of chapter contents
        # Clean up the temporary storage
        del self.chapters_written_so_far
        print(f"Finished writing {len(exec_res_list)} chapters.")

# Helper function defined elsewhere in nodes.py (used by prep)
# def get_content_for_indices(files_data, indices): ...
```
This code shows how `prep` sets up the list of chapter jobs. The `exec` method is the core, building a rich prompt for each chapter, calling `call_llm`, and crucially adding the result to `self.chapters_written_so_far` to build context for subsequent `exec` calls in the batch. Finally, `post` collects all the results from the batch and saves them.
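One detail worth pulling out of `prep` is the filename generation, since the cross-chapter links depend on it. Here it is as a standalone function (the function name is ours; the logic mirrors the snippet above):

```python
def chapter_filename(position: int, concept_name: str) -> str:
    """Mirror of the filename logic in prep(): every non-alphanumeric
    character becomes an underscore, the name is lowercased, and the
    chapter number is zero-padded."""
    safe_name = "".join(c if c.isalnum() else "_" for c in concept_name).lower()
    return f"{position + 1:02d}_{safe_name}.md"

print(chapter_filename(0, "Web Interface (Frontend)"))
# 01_web_interface__frontend_.md
```

Because the same rule is applied when building `full_chapter_listing` and when writing files later, the Markdown links the AI emits always point at real filenames.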
## Key Aspects for Beginner-Friendly Content
The intelligence for making the content beginner-friendly doesn't just come from the AI itself, but from the instructions given in the prompt built within the `exec` method. These instructions guide the AI to:
- Use simple language and analogies.
- Focus on use cases and practical explanations.
- Keep code examples short and provide explanations after them.
- Break down complex ideas.
- Suggest using visual aids like Mermaid diagrams.
- Ensure proper linking between chapters using the provided structure.
- Maintain a welcoming tone.
These detailed instructions are what translate the raw analysis into a structured, easy-to-follow tutorial aimed at someone new to the codebase.
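As a hedged sketch of how such instructions could be folded into the prompt (the exact wording lives in the real prompt in `function_app/nodes.py`, and these bullet strings are our paraphrase):

```python
# Hypothetical instruction list; the real prompt's wording differs.
WRITING_INSTRUCTIONS = [
    "Use a beginner-friendly tone with simple language and analogies.",
    "Keep code examples short (under 20 lines) and explain them afterwards.",
    "Link to other chapters using the provided full chapter listing.",
    "Suggest Mermaid diagrams where a visual would help.",
]

def build_instruction_block() -> str:
    """Render the instructions as a Markdown bullet list for the prompt."""
    return "\n".join(f"- {line}" for line in WRITING_INSTRUCTIONS)
```

Keeping the instructions in one place like this makes the house style easy to adjust without touching the rest of the prompt.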
## Benefits of this Approach
- Automation: Automatically generates entire tutorial chapters, saving huge amounts of manual writing time.
- Consistency: Follows a defined structure and tone guided by the prompt instructions.
- Leverages Analysis: Directly uses the output of code analysis and ordering, ensuring the tutorial is based on the actual codebase structure.
- Contextual Flow: Provides context from previous chapters to the AI, helping create smoother transitions and a more connected narrative.
- Scalability: A `BatchNode` can potentially process multiple chapters in parallel (though our current implementation runs them sequentially within the batch for context building), which can speed up the process for larger projects.
## Conclusion
Tutorial Content Generation is where the pieces come together! Leveraging the power of AI through the `WriteChapters` node, the project takes the structured understanding of the codebase (Abstractions, Relationships, Order) and transforms it into the actual Markdown text for each tutorial chapter. By providing a detailed prompt with all necessary context and specific instructions for beginner-friendly writing, the AI generates the content, including explanations, code snippets, diagrams, and crucial links between chapters.
With the chapters written, the final step is to collect all these generated files and make them available. That's what we'll cover in the next chapter.
Next Chapter: Output Management
Generated by AI Codebase Knowledge Builder. References: [1](https://github.com/hieuminh65/Tutorial-Codebase-Knowledge/blob/be7f595a38221b3dd7b1585dc226e47c815dec6e/function_app/nodes.py), [2](https://github.com/hieuminh65/Tutorial-Codebase-Knowledge/blob/be7f595a38221b3dd7b1585dc226e47c815dec6e/nodes.py)