
Chapter 7: Tutorial Content Generation#

Welcome back! We've come a long way. We've seen the Web Interface (Frontend) where you start the process, how the backend is set up using Serverless Deployment (Azure Functions), and how a Workflow Engine (Pocket Flow) orchestrates all the steps. We've also covered the initial steps of getting the raw code via Code Fetching and using AI Understanding to analyze it and figure out the main concepts (Abstractions) and how they relate. Plus, you know how our project talks to the AI Brain using LLM Communication.

Now we have all the ingredients:

  • The raw code content.
  • A list of the main concepts in the code (Abstractions).
  • An understanding of how those concepts connect (Relationships).
  • A planned order for the tutorial chapters.
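At this point the shared store holds all of these ingredients. A rough sketch of its shape, using the keys that later steps read (the values here are made up, and the exact structure of the "files" entries is an assumption):

```python
# Illustrative shape of the shared store at this stage of the workflow.
# Key names match those read by WriteChapters; the example values are invented.
shared = {
    "project_name": "example-project",
    # Assumed shape: (path, content) pairs for each fetched file.
    "files": [("app.py", "print('hello')")],
    "abstractions": [
        {"name": "Web Interface", "description": "The frontend...", "files": [0]},
    ],
    "chapter_order": [0],   # indices into "abstractions", in tutorial order
    "language": "english",
}
```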

But we don't have the actual tutorial text! We need to turn these ingredients into readable, beginner-friendly explanations, code examples, and diagrams that someone new to the codebase can understand.

This is the core task of Tutorial Content Generation.

What is Tutorial Content Generation?#

Tutorial Content Generation is the step where the project writes the actual chapters of the tutorial. It takes the structured information gathered in the previous steps – especially the identified abstractions, their descriptions, relevant code snippets, and the determined chapter order – and uses the AI to craft the narrative.

Think of it like writing a book based on a detailed outline and research notes. You have the structure (chapter order) and the topics/facts for each chapter (abstractions, relationships, code). Now, you need to write the prose, explaining things clearly, providing examples, and making sure the chapters flow logically from one to the next.

The Problem: Writing a Tutorial is Hard!#

Even with a clear understanding of the codebase's concepts and structure, writing a good tutorial is a skill. You need to:

  • Explain complex technical ideas in simple terms.
  • Use analogies that make sense to beginners.
  • Select and present relevant code snippets clearly.
  • Break down complex processes into step-by-step instructions.
  • Ensure a smooth flow between different topics and chapters.
  • Add diagrams and visualizations where helpful.

Doing this manually for an entire codebase is a massive effort. It requires deep technical understanding and excellent communication skills, specifically tailored for beginners.

The Solution: AI as Your Tutorial Author#

Our project leverages the power of the AI (the LLM) once again, this time to act as the tutorial author. By giving the AI the structured analysis results and very specific instructions (via a detailed prompt), we can automate the process of writing each chapter.

This is handled primarily by one specific Pocket Flow Node: the WriteChapters node.

How It Works: The WriteChapters Node#

In the Pocket Flow workflow (Chapter 3), the WriteChapters node sits after the analysis and ordering steps. Its job is to go through the planned chapter order and write the content for each chapter.

Let's look at the process orchestrated by this node:

  1. Batch Processing (BatchNode): WriteChapters is implemented as a BatchNode. This is useful because the task of writing one chapter is largely independent of writing another (though we do provide context about previous chapters, as we'll see). A BatchNode prepares a list of items to process, and then the exec method is called for each item in that list. In this case, each item represents the job of writing one tutorial chapter.
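The BatchNode lifecycle described above can be sketched with a minimal stand-in class. This is not the real Pocket Flow API, just an illustration of how prep produces a list of items, exec runs once per item, and post receives every result:

```python
# Illustrative stand-in for the BatchNode lifecycle (NOT the actual
# Pocket Flow API): prep() returns a list of work items, exec() is
# called once per item, and post() receives all results in order.
class MiniBatchNode:
    def prep(self, shared):
        # One work item per planned chapter.
        return shared["chapter_order"]

    def exec(self, item):
        # Process ONE item; here we just tag it.
        return f"chapter for abstraction {item}"

    def post(self, shared, prep_res, exec_res_list):
        # Collect every exec() result into the shared store.
        shared["chapters"] = exec_res_list

    def run(self, shared):
        items = self.prep(shared)
        results = [self.exec(it) for it in items]
        self.post(shared, items, results)

shared = {"chapter_order": [2, 0, 1]}
MiniBatchNode().run(shared)
print(shared["chapters"])  # one result per planned chapter, in order
```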

  2. prep Phase: Prepare the Chapters:

    • The prep method of WriteChapters reads the chapter_order (the list of abstraction indices determined in OrderChapters), the abstractions list (with names, descriptions, and file indices from IdentifyAbstractions), and the original files data from the shared store.
    • It then creates a list of data objects, one for each planned chapter. Each object contains all the information the AI will need to write that specific chapter. This includes:
      • The chapter number.
      • The details (name, description, relevant file indices) of the abstraction that chapter is about.
      • The content of the relevant code files identified for this abstraction.
      • The full list of all chapters with their planned names and filenames (this is crucial for the AI to create links between chapters).
      • Information about the previous and next chapters in the sequence (to help the AI write transitions).
      • The project name and language setting.
    • This list of chapter data objects is what the BatchNode will process, calling exec for each one.

    sequenceDiagram
        participant Shared as Shared Store
        participant WriteNode as WriteChapters Node
        WriteNode->>Shared: Read "chapter_order"<br>"abstractions"<br>"files"<br>"project_name"<br>"language"
        WriteNode->>WriteNode: prep(shared)
        Note over WriteNode: Creates list of chapter job items<br>(1 item per planned chapter)<br>Each item has context for ONE chapter<br>(Abstraction, Code, Full Chapter List, etc.)
        WriteNode-->>WriteNode: Return list of chapter items
    The prep method gathers inputs and prepares a list of tasks, one for each chapter.
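One concrete detail of this phase is how chapter filenames are derived from abstraction names. A small sketch of that slug logic, mirroring the safe_name approach shown in the code later in this chapter:

```python
def chapter_filename(chapter_num, concept_name):
    # Replace every non-alphanumeric character with an underscore,
    # lowercase the result, and prefix a zero-padded chapter number.
    safe_name = "".join(c if c.isalnum() else "_" for c in concept_name).lower()
    return f"{chapter_num:02d}_{safe_name}.md"

print(chapter_filename(1, "Web Interface (Frontend)"))
# → "01_web_interface__frontend_.md"
```

Note that consecutive non-alphanumeric characters (like the space before the parenthesis) each become their own underscore.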

  3. exec Phase: Write One Chapter:

    • The exec method is called by the BatchNode framework for each chapter data object prepared by prep. It receives the data for one specific chapter.
    • Inside exec, the code performs the most critical task: it builds a very detailed prompt for the AI. This prompt is like a set of instructions and context specifically for writing that single chapter.
    • The prompt includes:
      • The chapter's number, the project name, and the specific concept (Abstraction name and description) it should explain.
      • The content of the relevant code snippets for this concept.
      • The complete list of all planned chapters including their calculated filenames (e.g., "1. Web Interface (Frontend)"). This is essential so the AI knows the titles and filenames when instructed to create links to other chapters.
      • A summary of the content generated for the previous chapters in this batch run. This is stored in a temporary instance variable (self.chapters_written_so_far) that the BatchNode instance keeps track of. Including this allows the AI to build upon what was just explained, creating a more coherent tutorial flow and writing proper transitions.
      • Detailed instructions on how to write the chapter:
        • Use a beginner-friendly tone.
        • Start with a high-level problem/use case.
        • Break down complex ideas.
        • Explain how to use the concept.
        • Provide simple, short code examples (under 20 lines!) and explain them.
        • Describe the internal implementation, perhaps with a simple sequence diagram.
        • Crucially, use Markdown links to reference other chapters, looking up the correct title and filename from the provided full chapter list.
        • Use analogies and Mermaid diagrams.
        • Write a conclusion and transition to the next chapter, again using a Markdown link if applicable.
        • Specify the output format should only be the Markdown text for that chapter.
        • Include language instructions if the target language is not English.
    • The exec method then calls the call_llm utility (Chapter 6) with this carefully crafted prompt.
    • It receives the raw Markdown text response from the AI.
    • It performs basic validation or cleanup (like ensuring the chapter starts with the correct Markdown heading).
    • It adds the generated Markdown content for this chapter to the temporary list (self.chapters_written_so_far). This content will then be available as "Context from previous chapters" when the exec method runs for the next chapter in the batch.
    • It returns the generated Markdown content for this specific chapter.

    sequenceDiagram
        participant WriteNode as WriteChapters Node<br>(exec method, running for 1 chapter)
        participant LLM_Util as call_llm Utility
        participant LLM as Large Language Model (AI)
    
        WriteNode->>WriteNode: Get current chapter's item data<br>Get self.chapters_written_so_far<br>Build detailed prompt<br>(Concept, Code, Full Structure, Previous Summary, Instructions)
        WriteNode->>LLM_Util: call_llm(prompt)
        LLM_Util->>LLM: Send prompt for ONE chapter
        LLM-->>LLM_Util: Return raw Markdown text
        LLM_Util-->>WriteNode: Return text
        WriteNode->>WriteNode: Validate/Cleanup text<br>Add text to self.chapters_written_so_far
        WriteNode-->>WriteNode: Return generated chapter text
    The exec method runs for each chapter item, builds a detailed prompt including context from previous chapters, calls the AI, gets the content, and updates the context for the next chapter.
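The context carryover between exec calls boils down to appending each finished chapter to a list and joining that list into one summary string for the next call. A minimal sketch of that mechanism (variable names follow the code shown later):

```python
# Sketch of how context accumulates across exec() calls: each finished
# chapter's text is appended, and the next call joins all of them.
chapters_written_so_far = []

def previous_summary():
    # An empty list means no prior context: this is the first chapter.
    return "\n---\n".join(chapters_written_so_far) or "This is the first chapter."

first = previous_summary()                          # no chapters yet
chapters_written_so_far.append("# Chapter 1: Intro")
chapters_written_so_far.append("# Chapter 2: Setup")
third = previous_summary()                          # chapters 1 and 2, separated
print(first)
print(third)
```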

  4. post Phase: Collect All Chapters:

    • After the exec method has successfully run for all the chapter items prepared by prep, the post method of WriteChapters is called.
    • The BatchNode framework automatically collects all the return values from every exec call and passes them to post as a list (exec_res_list). This list contains the generated Markdown content for every single chapter, in the order they were processed.
    • The post method takes this list of chapter contents and stores it in the shared dictionary under the key "chapters".
    • It also cleans up the temporary self.chapters_written_so_far list.

    sequenceDiagram
        participant WriteNode as WriteChapters Node
        participant Shared as Shared Store
        WriteNode->>WriteNode: post(shared, ..., exec_res_list)<br>exec_res_list contains ALL chapter contents
        WriteNode->>Shared: Write "chapters"<br>(List of all chapter Markdown contents)
        WriteNode->>WriteNode: Clean up self.chapters_written_so_far
    The post method is called once after all chapters are written, collecting the results and storing them in the shared store.

By using a BatchNode and carefully managing the context passed to the AI (the full structure for linking and the cumulative summary of previous chapters), the WriteChapters node efficiently generates the complete tutorial content, chapter by chapter.

Looking at the Code (function_app/nodes.py)#

Let's look at simplified snippets from the WriteChapters class in function_app/nodes.py to see these steps reflected in code.

# function_app/nodes.py (Simplified WriteChapters)
from pocketflow import BatchNode # Note: It's a BatchNode!
from utils.call_llm import call_llm # We use the LLM communication utility
import os # For path joining, used in filename generation

class WriteChapters(BatchNode):
    def prep(self, shared):
        chapter_order = shared["chapter_order"] # List of indices for chapter order
        abstractions = shared["abstractions"]   # List of identified concepts
        files_data = shared["files"] # Original file content
        language = shared.get("language", "english") # Language setting

        self.chapters_written_so_far = [] # Temporary storage for previous chapters summary

        # --- Prepare full chapter list and filenames for linking ---
        all_chapters = []
        chapter_filenames = {} # Map abstraction index to filename
        for i, abstraction_index in enumerate(chapter_order):
            if 0 <= abstraction_index < len(abstractions):
                chapter_num = i + 1
                chapter_name = abstractions[abstraction_index]["name"] # Get concept name
                # Create filename from concept name (e.g., "01_web_interface.md")
                safe_name = "".join(c if c.isalnum() else '_' for c in chapter_name).lower()
                filename = f"{i+1:02d}_{safe_name}.md"
                # Store for linking: "[Chapter Title](filename.md)"
                all_chapters.append(f"{chapter_num}. [{chapter_name}]({filename})")
                # Store mapping for easy lookup in exec
                chapter_filenames[abstraction_index] = {"num": chapter_num, "name": chapter_name, "filename": filename}

        full_chapter_listing = "\n".join(all_chapters) # This goes into the prompt

        # --- Create list of items for the batch process (1 item per chapter) ---
        items_to_process = []
        for i, abstraction_index in enumerate(chapter_order):
            if 0 <= abstraction_index < len(abstractions):
                abstraction_details = abstractions[abstraction_index] # Concept details for this chapter
                related_file_indices = abstraction_details.get("files", []) # Indices of relevant files
                # Get content for relevant files using helper function
                related_files_content_map = get_content_for_indices(files_data, related_file_indices)

                # Prepare data for this specific chapter's item
                items_to_process.append({
                    "chapter_num": i + 1,
                    "abstraction_index": abstraction_index,
                    "abstraction_details": abstraction_details,
                    "related_files_content_map": related_files_content_map,
                    "project_name": shared["project_name"],
                    "full_chapter_listing": full_chapter_listing, # Needed for linking
                    "chapter_filenames": chapter_filenames, # Needed for linking
                    "language": language,
                    # Note: 'previous_chapters_summary' is NOT added here, it's generated in exec
                })
            # else: handle invalid index... (omitted for brevity)

        print(f"Preparing to write {len(items_to_process)} chapters...")
        return items_to_process # Return the list of items

    def exec(self, item):
        # This runs for EACH item (chapter)
        abstraction_name = item["abstraction_details"]["name"]
        abstraction_description = item["abstraction_details"]["description"]
        chapter_num = item["chapter_num"]
        project_name = item.get("project_name")
        language = item.get("language", "english")

        print(f"Writing chapter {chapter_num} for: {abstraction_name} using LLM...")

        # Get the content generated for previous chapters (stored in the instance variable)
        # This provides context for writing transitions
        previous_chapters_summary = "\n---\n".join(self.chapters_written_so_far)

        # Format relevant code snippets for the prompt
        file_context_str = "\n\n".join(
            f"--- File: {idx_path.split('# ')[1] if '# ' in idx_path else idx_path} ---\n{content}"
            for idx_path, content in item["related_files_content_map"].items()
        )

        # --- Build the detailed prompt for the AI (Simplified) ---
        # The prompt is quite long and contains all the instructions mentioned earlier.
        # It includes:
        # - Language instructions (if not English)
        # - The chapter heading structure
        # - The concept details (name, description)
        # - The FULL chapter listing for linking
        # - The summary of previously written chapters (previous_chapters_summary)
        # - The relevant code snippets (file_context_str)
        # - Detailed instructions on content, tone, code limits, diagrams, cross-linking, etc.
        # The actual prompt is complex, but the key is it combines ALL necessary context and instructions.
        prompt = f"""
        Write a very beginner-friendly tutorial chapter (in Markdown format) for the project `{project_name}` about the concept: "{abstraction_name}". This is Chapter {chapter_num}.

        ... (Rest of the detailed instructions and context like the actual prompt in the file) ...

        Context from previous chapters:
        {previous_chapters_summary if previous_chapters_summary else "This is the first chapter."}

        Relevant Code Snippets (Code itself remains unchanged):
        {file_context_str if file_context_str else "No specific code snippets provided for this abstraction."}

        ... (More instructions on format, linking, diagrams, etc.) ...

        Output *only* the Markdown content for this chapter.
        """ # This is a highly simplified representation of the full prompt string

        chapter_content = call_llm(prompt) # Call the AI!

        # Basic validation/cleanup (like checking/adding the heading)
        actual_heading = f"# Chapter {chapter_num}: {abstraction_name}"
        if not chapter_content.strip().startswith(actual_heading):
             # Add heading if missing or incorrect... (simplified)
             chapter_content = f"{actual_heading}\n\n{chapter_content}"

        # Add the generated content to the temporary list for the NEXT chapter's context
        self.chapters_written_so_far.append(chapter_content)

        return chapter_content # Return the content for THIS chapter

    def post(self, shared, prep_res, exec_res_list):
        # This runs AFTER all exec calls are complete
        # exec_res_list is a list containing the return value of each exec call (the chapter content)
        shared["chapters"] = exec_res_list # Store the final list of chapter contents
        # Clean up the temporary storage
        del self.chapters_written_so_far
        print(f"Finished writing {len(exec_res_list)} chapters.")

# Helper function defined elsewhere in nodes.py (used by prep)
# def get_content_for_indices(files_data, indices): ...

This code shows how prep sets up the list of chapter jobs. The exec method is the core, building a rich prompt for each chapter, calling call_llm, and crucially adding the result to self.chapters_written_so_far to build context for subsequent exec calls in the batch. Finally, post collects all the results from the batch and saves them.
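The heading validation step in exec can be exercised on its own. A small sketch of that cleanup, mirroring the simplified logic in the listing above:

```python
def ensure_heading(chapter_num, name, content):
    # Prepend the expected Markdown heading if the AI response lacks it.
    heading = f"# Chapter {chapter_num}: {name}"
    if not content.strip().startswith(heading):
        return f"{heading}\n\n{content}"
    return content

fixed = ensure_heading(3, "Workflow Engine", "Some body text...")
ok = ensure_heading(3, "Workflow Engine", "# Chapter 3: Workflow Engine\n\nBody")
```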

Key Aspects for Beginner-Friendly Content#

The intelligence for making the content beginner-friendly doesn't just come from the AI itself, but from the instructions given in the prompt built within the exec method. These instructions guide the AI to:

  • Use simple language and analogies.
  • Focus on use cases and practical explanations.
  • Keep code examples short and provide explanations after them.
  • Break down complex ideas.
  • Suggest using visual aids like Mermaid diagrams.
  • Ensure proper linking between chapters using the provided structure.
  • Maintain a welcoming tone.

These detailed instructions are what turn the raw analysis into a structured, easy-to-follow tutorial aimed at someone new to the codebase.

Benefits of this Approach#

  • Automation: Automatically generates entire tutorial chapters, saving huge amounts of manual writing time.
  • Consistency: Follows a defined structure and tone guided by the prompt instructions.
  • Leverages Analysis: Directly uses the output of code analysis and ordering, ensuring the tutorial is based on the actual codebase structure.
  • Contextual Flow: Provides context from previous chapters to the AI, helping create smoother transitions and a more connected narrative.
  • Scalability: A BatchNode can potentially process multiple chapters in parallel (though our current implementation makes it sequential within the batch for context building), which can speed up the process for larger projects.

Conclusion#

Tutorial Content Generation is where the pieces come together! Leveraging the power of AI through the WriteChapters node, the project takes the structured understanding of the codebase (Abstractions, Relationships, Order) and transforms it into the actual Markdown text for each tutorial chapter. By providing a detailed prompt with all necessary context and specific instructions for beginner-friendly writing, the AI generates the content, including explanations, code snippets, diagrams, and crucial links between chapters.

With the chapters written, the final step is to collect all these generated files and make them available. That's what we'll cover in the next chapter.

Next Chapter: Output Management


Generated by AI Codebase Knowledge Builder. References: 1(https://github.com/hieuminh65/Tutorial-Codebase-Knowledge/blob/be7f595a38221b3dd7b1585dc226e47c815dec6e/function_app/nodes.py), 2(https://github.com/hieuminh65/Tutorial-Codebase-Knowledge/blob/be7f595a38221b3dd7b1585dc226e47c815dec6e/nodes.py)