
Chapter 8: Output Management#

Welcome back! In our journey through the Tutorial-Codebase-Knowledge project, we've covered how you start the process with the Web Interface (Frontend), how your request is handled by Serverless Deployment (Azure Functions), and how the work is orchestrated by a Workflow Engine (Pocket Flow). We then saw how the code is retrieved via Code Fetching, analyzed with AI Understanding and the AI Brain (LLM Communication), and finally how the actual text for each chapter is written during Tutorial Content Generation.

So, we now have all the parts of the tutorial – the introduction, the project summary, the relationship diagram, and the content for each individual chapter, all formatted in Markdown text. That's great! But where does this content go? How is it saved so you can actually read it? And how is it organized so the frontend can easily display it?

This is the job of Output Management.

What is Output Management?#

Think of Output Management as the project's librarian and publisher. Once the "authors" (the AI during content generation) have finished writing all the chapters, the Output Management system takes these pieces, puts them together correctly, gives them proper names, organizes them neatly, and puts them in a place where readers (you, using the web interface) can easily find and access them.

Its core responsibilities are:

  1. Collecting and Assembling: Taking the different parts of the generated tutorial (index content, chapter content).
  2. Structuring: Organizing the files into a logical folder structure.
  3. Saving/Publishing: Storing the final files in a persistent location.
  4. Making Accessible: Ensuring the frontend (or a user browsing files) can retrieve the saved tutorial.

The Problem: Generated Content Needs a Home#

Without Output Management, the generated tutorial content would just exist temporarily in the memory of the Azure Function that wrote it. As soon as that function finishes, the content would be gone! We need a way to save it permanently.

Also, simply saving a bunch of files isn't enough. The frontend needs a clear way to know:

  • Where to find the tutorial for a specific repository.
  • What files exist (index, chapters).
  • How to request the content of a specific file.

The files need consistent naming and a predictable location.
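For example, after a successful cloud run the stored files follow a layout like this (the project and chapter names below are placeholders for illustration):

```text
tutorials/                       <- Blob Storage container (fixed name)
    my-project/                  <- "folder" prefix derived from the project name
        index.md                 <- summary, relationship diagram, chapter links
        01_first_abstraction.md  <- one numbered Markdown file per chapter
        02_second_abstraction.md
```

When running locally, or when the cloud upload fails, the same files are written under output/my-project/ on your own disk instead.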

The Solution: Saving to Storage#

Our project solves this by saving the generated tutorial files to a dedicated storage location. Depending on the deployment environment, this can be:

  1. Local Filesystem: If you run the project locally, the tutorial files (Markdown files) are saved into a directory on your computer.
  2. Azure Blob Storage: In the cloud deployment (using Azure Functions), the files are uploaded to Azure Blob Storage. This is a cost-effective cloud service for storing large amounts of unstructured data like files. Storing them here makes them easily accessible to the Azure Functions that serve the content to the frontend.

This saving step is the final action in the main tutorial generation workflow.

How Our Project Handles Output Management#

In our Pocket Flow workflow (Chapter 3), the CombineTutorial node is the last node to run. Its name gives you a big hint about its job! It takes the individual chapter contents and combines them into the final tutorial structure, then handles the saving/publishing step.

Let's look at the process orchestrated by this node:

  1. prep Phase: Prepare Files and Structure:

    • The prep method of CombineTutorial reads the project name (shared["project_name"]), the project summary and relationships (shared["relationships"]), the ordered list of abstraction indices (shared["chapter_order"]), the abstraction details (names, descriptions - shared["abstractions"]), and the actual generated Markdown content for each chapter (shared["chapters"]).
    • It uses this information to create the content for the main index.md file. This file includes the project title, the high-level summary, a Mermaid diagram visualizing the relationships between abstractions (generated using the data from shared["relationships"]), and a list of links to all the tutorial chapters.
    • It also prepares a list of dictionaries, one for each chapter, containing the planned filename (e.g., 01_chapter_name.md) and the chapter's Markdown content. The filename is generated consistently based on the chapter number and the (potentially translated) name of the abstraction it covers; a rough sketch of this naming convention follows the diagram below.
    • Finally, it adds the standard project attribution footer (Generated by [AI Codebase Knowledge Builder]...) to both the index.md content and each chapter's content.

    sequenceDiagram
        participant Shared as Shared Store
        participant CombineNode as CombineTutorial Node
        CombineNode->>Shared: Read "project_name"<br>"relationships"<br>"chapter_order"<br>"abstractions"<br>"chapters"
        CombineNode->>CombineNode: prep(shared)
        Note over CombineNode: Create index.md content<br>(Summary, Diagram, Chapter Links)<br>Create list of chapter file data<br>(Filename, Content)<br>Add attribution to all files
        CombineNode-->>CombineNode: Return {index_content, chapter_files}
    The prep method gathers all the pieces and formats the final files, including the index and filenames.
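    The filename helper itself isn't reproduced in the simplified code later in this chapter, so here is a minimal sketch of the naming convention, assuming a simple slugification of the abstraction name (the function name and regex are illustrative, not the project's actual code):

    ```python
    import re

    def to_chapter_filename(chapter_number: int, abstraction_name: str) -> str:
        """Illustrative sketch: build a consistent filename like '01_web_interface.md'."""
        # Lowercase the (possibly translated) abstraction name and replace anything
        # that is not alphanumeric with underscores so the name is filesystem-safe.
        safe_name = re.sub(r"[^a-z0-9]+", "_", abstraction_name.lower()).strip("_")
        # Zero-pad the chapter number so the files sort in reading order.
        return f"{chapter_number:02d}_{safe_name}.md"

    # to_chapter_filename(1, "Web Interface (Frontend)") -> "01_web_interface_frontend.md"
    ```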

  2. exec Phase: Save/Upload the Files:

    • The exec method receives the prepared index.md content and the list of chapter file data from prep.
    • It determines the output location. This path is derived from the project_name (e.g., output/my-project or a path within a Blob Storage container like tutorials/my-project).
    • It attempts to upload the files to Azure Blob Storage first. It uses a helper function (upload_to_blob_storage) for this. It uploads index.md and each chapter file individually, placing them within a "folder" (a prefix in blob storage terms) named after the project. It specifies the content type as text/markdown so web browsers interpret them correctly.
    • If the upload to Blob Storage is successful, it also creates a minimal local directory with an info.txt file inside. This info.txt file simply lists the URLs of the files that were uploaded to Blob Storage. This is useful for local debugging or quickly finding the online output.
    • If the upload to Blob Storage fails for any reason (e.g., connection error, missing connection string config), the node includes fallback logic to save the files directly to the local filesystem within a directory named after the project (output/project_name).
    • It returns information about where the files were saved/uploaded (either the Blob Storage details or the local path).

    sequenceDiagram
        participant CombineNode as CombineTutorial Node<br>(exec method)
        participant BlobUtil as upload_to_blob_storage Utility
        participant BlobStorage as Azure Blob Storage
        participant FileSystem as Local File System
    
        CombineNode->>CombineNode: Determine output path/prefix<br>(e.g., tutorials/project_name)
        loop For index.md and each chapter file
            CombineNode->>BlobUtil: call upload_to_blob_storage<br>(container="tutorials", blob="project_name/file.md", content, "text/markdown")
            BlobUtil->>BlobStorage: Upload Blob
            BlobStorage-->>BlobUtil: Success/Error
            alt If upload successful
                BlobUtil-->>CombineNode: Return Blob URL
            else If upload fails
                BlobUtil--xCombineNode: Raise Exception
            end
        end
        alt If ALL uploads successful
            CombineNode->>FileSystem: Create local dir & info.txt<br> (with blob URLs)
            CombineNode-->>CombineNode: Return Blob Info & local path
        else If ANY upload fails
            CombineNode->>FileSystem: Save files locally<br> (index.md, chapters)
            CombineNode-->>CombineNode: Return local path
            Note right of FileSystem: Fallback!
        end
    The exec method orchestrates saving/uploading, prioritizing Blob Storage and falling back to local saves.

  3. post Phase: Update Shared State:

    • The post method receives the result of exec (either the Blob Storage info + local path or just the local path).
    • It stores the primary output location (the local path where the info.txt was saved, or the local save path in fallback mode) in shared["final_output_dir"].
    • If the Blob Storage upload was successful, it also stores the details of the blob storage location (container, path, list of file URLs) in shared["blob_storage_info"].
    • This makes the location of the generated tutorial accessible to any subsequent steps (though CombineTutorial is the last node in the main flow) and available for logging or status updates by the Azure Function host. A small example of reading these keys follows the diagram below.

    sequenceDiagram
        participant CombineNode as CombineTutorial Node
        participant Shared as Shared Store
        CombineNode->>CombineNode: post(shared, ..., exec_res)
        Note over CombineNode: exec_res is path OR dict with blob info + path
        CombineNode->>Shared: Write "final_output_dir"<br>Write "blob_storage_info" (if applicable)
    The post method stores the final output location information in the shared store.
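    For example, once the flow finishes, the Azure Function host (or any later step) could read these keys back from the shared store. This is a minimal sketch, not the host's actual code:

    ```python
    # Minimal sketch of consuming the keys written by post().
    output_dir = shared["final_output_dir"]      # local info.txt dir, or the fallback save dir
    blob_info = shared.get("blob_storage_info")  # missing/None if the upload fell back to local files

    if blob_info:
        print(f"Uploaded {len(blob_info['files'])} files to container '{blob_info['container']}' "
              f"under path '{blob_info['path']}'")
    else:
        print(f"Tutorial saved locally in: {output_dir}")
    ```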

How the Frontend Accesses Output from Blob Storage#

Once the files are in Azure Blob Storage, the Web Interface (Frontend) doesn't access Blob Storage directly. Instead, it uses the other two Azure Functions we discussed in Chapter 2: Serverless Deployment (Azure Functions):

  • get-output-structure: Called by the frontend to get the list of chapters and filenames for a specific repository by querying Blob Storage for files under that repository's prefix (tutorials/repo_name/).
  • get-output-content: Called by the frontend when you click on a specific chapter link to retrieve the Markdown content of a single file from Blob Storage (tutorials/repo_name/chapter_file.md).

So, Output Management makes the files persistently available, and dedicated Azure Functions act as the secure gateway for the frontend to access them from the cloud storage.
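The implementation of these two functions lives in function_app/function_app.py and isn't reproduced in this chapter, but the heart of get-output-structure can be sketched roughly as follows, assuming the azure-storage-blob SDK and the same tutorials container used by CombineTutorial (the helper name is illustrative):

```python
import os
from azure.storage.blob import BlobServiceClient

def list_tutorial_files(repo_name: str) -> list[str]:
    """Illustrative sketch: list the Markdown files stored under tutorials/<repo_name>/."""
    connection_string = os.environ["AzureWebJobsStorage"]
    service = BlobServiceClient.from_connection_string(connection_string)
    container = service.get_container_client("tutorials")
    # list_blobs(name_starts_with=...) yields every blob under the repository's prefix.
    return [blob.name for blob in container.list_blobs(name_starts_with=f"{repo_name}/")]
```

get-output-content works the same way in reverse: given one blob name, it downloads that single blob and returns its Markdown text to the frontend.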

Looking at the Code (function_app/nodes.py and function_app/utils/upload_to_blob_storage.py)#

Let's look at simplified snippets from the CombineTutorial class and the upload_to_blob_storage helper function.

First, the CombineTutorial node:

# function_app/nodes.py (Simplified CombineTutorial)
import os
from pocketflow import Node
from .utils.upload_to_blob_storage import upload_to_blob_storage # Import the helper
import yaml # For relationships (Mermaid diagram)
# ... other imports for diagram generation, file handling ...

class CombineTutorial(Node):
    def prep(self, shared):
        project_name = shared["project_name"]
        output_base_dir = shared.get("output_dir", "output") # Default local dir
        output_path = os.path.join(output_base_dir, project_name) # Path for local fallback/info file

        # Get generated content and analysis results (potentially translated)
        relationships_data = shared["relationships"] # Has summary and details
        chapter_order = shared["chapter_order"] # List of indices
        abstractions = shared["abstractions"] # List of dicts (name, description, files)
        chapters_content = shared["chapters"] # List of Markdown strings

        # --- Code to generate Mermaid Diagram from relationships_data and abstractions ---
        # This involves formatting nodes and edges using abstraction names and relationship labels.
        mermaid_diagram = "flowchart TD\n    A1[\"Node 1\"] --> A2[\"Node 2\"]" # Simplified example
        # --- End Diagram Generation ---

        # --- Code to create index.md content ---
        # Uses project_name, relationships_data (summary), repo_url, mermaid_diagram,
        # chapter_order, abstractions (names), and generates chapter filenames/links.
        index_content = f"# Tutorial: {project_name}\n\n{relationships_data['summary']}\n\n"
        index_content += "```mermaid\n" + mermaid_diagram + "\n```\n\n"
        index_content += "## Chapters\n\n"
        # Loops through chapter_order, builds filenames like "01_chapter_name.md",
        # and adds links like "[Chapter Name](01_chapter_name.md)" to index_content.
        chapter_files_list = [] # List of {"filename": str, "content": str}
        # ... population of chapter_files_list ...
        # --- End index.md content ---

        # Add attribution to all content
        attribution = "\n\n---\n\nGenerated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)"
        index_content += attribution
        for chapter_info in chapter_files_list:
             chapter_info["content"] += attribution

        return {
            "output_path": output_path, # Path for local operations
            "index_content": index_content,
            "chapter_files": chapter_files_list # List of chapter data
        }

    def exec(self, prep_res):
        output_path = prep_res["output_path"] # This is the LOCAL path like output/my-project
        index_content = prep_res["index_content"]
        chapter_files = prep_res["chapter_files"]

        # Determine blob storage details using the local path's basename (project name)
        project_name = os.path.basename(output_path)
        container_name = "tutorials" # Fixed container name
        blob_base_path = project_name # Use project name as the blob path prefix

        print(f"Combining tutorial and uploading to Azure Blob Storage (container: {container_name}, path: {blob_base_path})")

        file_urls = [] # To store URLs of uploaded files

        try:
            # Upload index.md
            index_blob_path = f"{blob_base_path}/index.md"
            index_url = upload_to_blob_storage(
                container_name=container_name,
                blob_name=index_blob_path,
                content=index_content,
                content_type="text/markdown" # Important for web access
            )
            file_urls.append({"file": "index.md", "url": index_url})
            print(f"  - Uploaded index.md to {index_url}")

            # Upload chapter files
            for chapter_info in chapter_files:
                chapter_blob_path = f"{blob_base_path}/{chapter_info['filename']}"
                chapter_url = upload_to_blob_storage(
                    container_name=container_name,
                    blob_name=chapter_blob_path,
                    content=chapter_info["content"],
                    content_type="text/markdown" # Important for web access
                )
                file_urls.append({"file": chapter_info['filename'], "url": chapter_url})
                print(f"  - Uploaded {chapter_info['filename']} to {chapter_url}")

            print(f"\nTutorial generation and upload complete!")

            # Create a minimal local info.txt pointing to blob storage
            try:
                os.makedirs(output_path, exist_ok=True)
                info_content = f"Tutorial '{project_name}' uploaded to Azure Blob Storage.\n\nFiles:\n"
                for file_info in file_urls:
                    info_content += f"- {file_info['file']}: {file_info['url']}\n"
                info_filepath = os.path.join(output_path, "info.txt")
                with open(info_filepath, "w", encoding="utf-8") as f:
                    f.write(info_content)
                print(f"  - Created local reference file: {info_filepath}")
            except Exception as e:
                print(f"Warning: Could not create local reference file: {str(e)}")

            # Return blob info + local path for post
            return {
                "local_path": output_path,
                "blob_container": container_name,
                "blob_path": blob_base_path,
                "files": file_urls # List of {"file": filename, "url": url}
            }

        except Exception as e:
            print(f"Error uploading to Azure Blob Storage: {str(e)}")
            print(f"Falling back to local filesystem...")

            # Fallback to local filesystem if blob storage upload fails
            os.makedirs(output_path, exist_ok=True)

            # Write index.md locally
            index_filepath = os.path.join(output_path, "index.md")
            with open(index_filepath, "w", encoding="utf-8") as f:
                f.write(index_content)
            print(f"  - Wrote {index_filepath}")

            # Write chapter files locally
            for chapter_info in chapter_files:
                chapter_filepath = os.path.join(output_path, chapter_info["filename"])
                with open(chapter_filepath, "w", encoding="utf-8") as f:
                    f.write(chapter_info["content"])
                print(f"  - Wrote {chapter_filepath}")

            # Return just the local path
            return output_path

    def post(self, shared, prep_res, exec_res):
        # exec_res is either the local path string or the dict with blob info + local path
        if isinstance(exec_res, dict):
            shared["final_output_dir"] = exec_res["local_path"] # Store the local info path
            shared["blob_storage_info"] = { # Store the blob details
                "container": exec_res["blob_container"],
                "path": exec_res["blob_path"],
                "files": exec_res["files"] # URLs of the uploaded files
            }
            print(f"\nTutorial generation complete!")
            print(f"Files are uploaded to Azure Blob Storage container '{exec_res['blob_container']}' under path '{exec_res['blob_path']}'")
            print(f"Local reference file: {exec_res['local_path']}/info.txt")
        else:
            # Fallback path
            shared["final_output_dir"] = exec_res
            print(f"\nTutorial generation complete! Files are in: {exec_res}")

# Helper function defined in function_app/utils/upload_to_blob_storage.py
# def upload_to_blob_storage(...): ...
This snippet shows how prep structures the output, and exec attempts the upload to Blob Storage using upload_to_blob_storage, falling back to local saving if necessary. The post method updates the shared state with the location of the output.

Now, let's look at the simplified upload_to_blob_storage helper function, which is called by the CombineTutorial node:

# function_app/utils/upload_to_blob_storage.py (Simplified)
import os
from azure.storage.blob import BlobServiceClient, ContentSettings

def upload_to_blob_storage(container_name, blob_name, content, content_type=None):
    """
    Upload content to Azure Blob Storage.

    Args:
        container_name (str): The container name (e.g., 'tutorials').
        blob_name (str): The name of the blob (e.g., 'my-project/index.md').
        content (str): The string content to upload.
        content_type (str, optional): The MIME type (e.g., 'text/markdown').

    Returns:
        str: The URL of the uploaded blob.
    Raises:
        ValueError: If AzureWebJobsStorage connection string is not set.
        Exception: For any Azure Blob Storage errors.
    """
    # Get the connection string securely from environment variables
    connection_string = os.environ.get("AzureWebJobsStorage")
    if not connection_string:
        # This error will be caught by the calling node's exec and trigger fallback
        raise ValueError("Azure Blob Storage connection string not configured.")

    # Create the Blob Service client
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)

    # Get or create the container
    try:
        container_client = blob_service_client.get_container_client(container_name)
        # Attempt to get properties to check if it exists; will raise if not
        container_client.get_container_properties()
    except Exception:
        # Container doesn't exist, create it
        container_client = blob_service_client.create_container(container_name)

    # Create a client for the specific blob (file)
    blob_client = blob_service_client.get_blob_client(
        container=container_name,
        blob=blob_name
    )

    # Set content type if provided (important for web browsers)
    content_settings = None
    if content_type:
        content_settings = ContentSettings(content_type=content_type)

    print(f"Uploading blob: {blob_name} with content type {content_type or 'None'}")
    # Upload the content, overwriting if it exists
    blob_client.upload_blob(
        content, # Content to upload
        overwrite=True, # Replace existing file if any
        content_settings=content_settings # Apply content type
    )
    print("Upload successful.")

    # Return the public URL of the blob (if container is configured for public access, otherwise requires SAS token)
    # Note: Access in this project's frontend is via GET functions, not direct public access.
    # The URL here is primarily for logging/debugging reference.
    return blob_client.url
This helper function uses the azure-storage-blob library to interact with Blob Storage. It gets the connection string from environment variables, finds or creates the container, gets a client for the target blob name (which includes the project folder structure, like my-project/index.md), sets the content type, and uploads the string content.
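As a quick usage illustration (the project name and content are placeholders), the CombineTutorial node calls the helper like this:

```python
# Minimal usage sketch: upload one Markdown file and print its blob URL.
url = upload_to_blob_storage(
    container_name="tutorials",
    blob_name="my-project/index.md",
    content="# Tutorial: my-project\n\n...",
    content_type="text/markdown",
)
print(url)  # e.g. https://<account>.blob.core.windows.net/tutorials/my-project/index.md
```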

Benefits of Output Management#

| Feature | Benefit | Why it matters here |
| --- | --- | --- |
| Persistence | Saves the generated tutorial files permanently. | The tutorial exists beyond the lifetime of the Azure Function execution. |
| Organization | Files are structured logically (e.g., project_name/). | Makes it easy to find all files for a specific tutorial. |
| Accessibility | Stores files in a location accessible to the frontend. | Enables the web interface to retrieve and display the tutorial content. |
| Reliability | Prioritizes cloud storage with local fallback. | Ensures output is saved even if the preferred cloud method temporarily fails. |
| Metadata | Sets content types for correct web display. | Browsers know how to interpret the Markdown files (text/markdown). |
| Attribution | Automatically adds a footer to all files. | Credits the project creator and provides a link. |

Conclusion#

Output Management, handled by the CombineTutorial node and the upload_to_blob_storage utility, is the critical final step in the tutorial generation process. It ensures that the valuable content created by the AI is saved in a persistent, organized, and accessible manner. By prioritizing Azure Blob Storage, it seamlessly integrates with the cloud-based frontend access functions, allowing users to easily view the generated tutorials through the web interface. The fallback to local saving provides robustness, ensuring that even if cloud storage is unavailable, the output is not lost.

You have now completed the core chapters covering the main components of the Tutorial-Codebase-Knowledge project! You've seen how a user request flows from the frontend through a serverless backend, orchestrated by a workflow engine, fetches code, analyzes it with AI, generates content, and finally saves and organizes that output.

While this chapter concludes the core pipeline, you might explore further aspects of the project, such as how file patterns are suggested or how error handling is managed beyond the basics covered here. Congratulations on making it through the tutorial!


