System for Generative AI Assisted Documents and Reports With Rich Visual Information and Feedback Improvements

Technical/scientific Challenge:

    The previous iteration of the platform successfully automated the generation of high-quality textual reports enriched with Python-generated, data-driven charts. The workflow dispatched user prompts to multiple LLMs, merged the outputs, performed quality assurance, and generated relevant visualizations in a secure sandbox.

    However, this established process faced new limitations. The final documents, while factually and grammatically sound, sometimes lacked a holistic, expert-level polish. There was no mechanism to check for strategic alignment between the text and visuals or to ensure the most persuasive narrative was presented. The workflow was linear, meaning any inconsistencies identified by a human reviewer required a manual restart of several steps. Furthermore, the platform was restricted to programmatic charts (matplotlib, graphviz), leaving a significant opportunity to incorporate more descriptive and engaging visuals like infographics and conceptual diagrams.

    Solution:

    To address these challenges, the system was upgraded with a multi-agent, iterative refinement architecture. This new process introduces an AI-powered critical review step and a parallel visual generation stream.

    User-facing Web Interface – Enhanced Visual & Quality Options

    The front-end was redesigned with an expanded “Visual Options” section and a new “Review Options” panel:

    Visual Generation Mode – users can now choose between:

    • Data-Driven Charts (Python code generation) – for precise, data-backed visualizations such as bar charts, line charts, and heat maps.
    • Illustrative Visuals (Generative AI) – for conceptual graphics, process flows, and infographics.

    Quality Control – a new option enables a final “AI Critical Review” pass to improve the coherence and impact of the finished document.

    Back-end Pipeline – Dual-Stream Generation with Iterative Refinement

    Initial Draft Generation

    The existing pipeline for prompt dispatch, text merging, and automated QA (Pass 1) remains the foundation.

    Dual-Stream Visual Generation

    After the initial text is finalized, the system initiates two parallel visual generation processes:

    • Path A: Python Code Generator: The existing Semantic Visual Planner identifies data points and generates Python scripts for execution in the secure sandbox. This path continues to handle precise, data-driven charts.
    • Path B: Infographics Generator: In parallel, a prompt is sent to a multimodal model, Gemini 2.5 Flash, to create illustrative visuals. The model synthesizes information from the text into infographics, flowcharts, and custom diagrams that explain complex processes or concepts visually.
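
    The two paths above can be sketched as a pair of concurrent tasks. This is a minimal illustration, not the platform's actual code: the function names generate_charts, generate_infographics, and generate_visuals are hypothetical, and both generators are stubs standing in for the sandboxed Python execution and the image-model API call.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_charts(text: str) -> list[str]:
    """Path A stub (hypothetical): would plan charts and run generated code in the sandbox."""
    return [f"chart for: {text[:20]}"]


def generate_infographics(text: str) -> list[str]:
    """Path B stub (hypothetical): would send a prompt to the multimodal image model."""
    return [f"infographic for: {text[:20]}"]


def generate_visuals(text: str) -> dict[str, list[str]]:
    """Run both visual paths concurrently and collect their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        charts = pool.submit(generate_charts, text)
        infographics = pool.submit(generate_infographics, text)
        # result() blocks until each path finishes and re-raises any exception it hit.
        return {"path_a": charts.result(), "path_b": infographics.result()}
```

    Threads are a reasonable fit here on the assumption that both paths spend most of their time waiting on external processes or network calls rather than computing locally.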

    First Draft Assembly

    The generated text and all visuals from both Path A and Path B are compiled into a single draft document.

    State-of-the-art reasoning LLM as Critical Reviewer

    This step is the core of the improved workflow. The complete draft is submitted to a reasoning-focused LLM (e.g., OpenAI o1/o3 or Gemini 2.5 Pro) that acts as an expert reviewer. This “Reviewer Agent” assesses the document against the original user requirements for:

    • Coherence: Does the narrative flow logically? Is there a consistent tone?
    • Text-Visual Synergy: Do the charts and infographics effectively support the text? Could a different visual type be more impactful?
    • Completeness: Are there any unsupported claims or missing explanations?
    • Impact: Does the document effectively communicate its key message?

    The output is a structured JSON object containing specific, actionable feedback (e.g., {"action": "replace_chart", "figure_id": "FIG_3", "suggestion": "Replace bar chart with a line chart to better show the trend over time."}).
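
    Because the feedback drives automated revisions, it is worth validating before anything acts on it. The sketch below parses reviewer output into typed records. It assumes the three fields shown in the example above; every action name other than "replace_chart" is hypothetical, as are the names FeedbackItem and parse_feedback.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Only "replace_chart" appears in the source; the other actions are assumed for illustration.
ALLOWED_ACTIONS = {"replace_chart", "revise_text", "regenerate_infographic"}


@dataclass
class FeedbackItem:
    action: str
    suggestion: str
    figure_id: Optional[str] = None  # absent for text-only feedback


def parse_feedback(raw: str) -> list[FeedbackItem]:
    """Parse reviewer JSON (a single object or a list of objects) into validated items."""
    data = json.loads(raw)
    records = data if isinstance(data, list) else [data]
    items = []
    for record in records:
        if not isinstance(record, dict):
            raise ValueError(f"feedback entry is not an object: {record!r}")
        action = record.get("action")
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        items.append(
            FeedbackItem(
                action=action,
                suggestion=record.get("suggestion", ""),
                figure_id=record.get("figure_id"),
            )
        )
    return items
```

    Rejecting unknown actions up front keeps a malformed or unexpected reviewer response from reaching the generators.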

    Iterative Refinement Loop

    The structured feedback from the Reviewer Agent triggers an automated revision cycle. The system routes each feedback item to the appropriate module: the text generator for textual adjustments, or one of the visual generators to recreate a specific figure. The loop runs for a predefined number of iterations or until the Reviewer Agent approves the document, so the output is polished and aligned without human intervention during the revision cycle.
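
    The control flow of such a loop can be sketched as follows. This is an illustration under stated assumptions rather than the platform's implementation: the draft is modeled as a plain dict, empty feedback is treated as approval, and refine, stub_review, replace_chart, and the "chart_type" key are all hypothetical.

```python
from typing import Callable

Draft = dict          # assumed shape: {"text": str, "figures": {figure_id: chart_type}}
FeedbackItem = dict   # assumed shape: {"action": str, ...}
Handler = Callable[[Draft, FeedbackItem], Draft]


def refine(
    draft: Draft,
    review: Callable[[Draft], list],
    handlers: dict,
    max_iterations: int = 3,
) -> tuple:
    """Review and revise until approval or until the iteration budget is spent.

    Returns (draft, approved). approved is False when the budget ran out,
    in which case the last revision has not been re-reviewed.
    """
    for _ in range(max_iterations):
        feedback = review(draft)
        if not feedback:  # assumption: no feedback means the reviewer approves
            return draft, True
        for item in feedback:
            handler = handlers.get(item["action"])
            if handler is not None:  # feedback with no registered module is skipped
                draft = handler(draft, item)
    return draft, False


def stub_review(draft: Draft) -> list:
    """Stand-in reviewer: objects only while FIG_3 is still a bar chart."""
    if draft["figures"].get("FIG_3") == "bar":
        return [{"action": "replace_chart", "figure_id": "FIG_3", "chart_type": "line"}]
    return []


def replace_chart(draft: Draft, item: FeedbackItem) -> Draft:
    """Stand-in for the visual generator: returns a new draft with the figure swapped."""
    figures = dict(draft["figures"])
    figures[item["figure_id"]] = item["chart_type"]
    return {**draft, "figures": figures}
```

    Calling refine({"text": "...", "figures": {"FIG_3": "bar"}}, stub_review, {"replace_chart": replace_chart}) revises the figure on the first pass and is approved on the second. The iteration cap matters because two models could otherwise trade revisions indefinitely.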

    Final Assembly

    The fully reviewed and refined text and visuals are assembled into the final document, complete with updated captions and accessibility text.
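
    As a rough illustration of this step, the sketch below joins text sections and figures into one document and attaches a caption and accessibility text to each figure. The source does not state the output format; Markdown is assumed here, and the Figure fields and the assemble_document function are hypothetical. For brevity, figures are appended after the text rather than placed inline.

```python
from dataclasses import dataclass


@dataclass
class Figure:
    figure_id: str
    path: str
    caption: str
    alt_text: str  # accessibility text, used as the Markdown image description


def assemble_document(title: str, sections: list[str], figures: list[Figure]) -> str:
    """Join the title, text sections, and captioned figures into a Markdown string."""
    parts = [f"# {title}", *sections]
    for number, fig in enumerate(figures, start=1):
        parts.append(f"![{fig.alt_text}]({fig.path})")
        parts.append(f"Figure {number}: {fig.caption}")
    # Blank lines between blocks keep each one a separate Markdown paragraph.
    return "\n\n".join(parts) + "\n"
```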

    Scientific impact:
    • Demonstrates a multi-agent “Generator-Reviewer” AI workflow – provides a practical implementation of an AI system where one set of agents generates content and another critically reviews and directs revisions, mimicking human collaboration.
    • Showcases hybrid visual generation – establishes a novel, dual-stream approach that combines the precision of programmatic, code-generated charts with the creative power of direct-to-image synthesis for infographics.
    • Establishes a model for automated iterative refinement – the feedback loop, driven by a reasoning LLM, introduces a scalable method for self-correction and quality enhancement in automated document creation, raising the quality bar for machine-generated reports.

    Benefits:
    • The AI Critical Review step ensures the final reports are not just error-free but also coherent, impactful, and aligned with the user's goals.
    • The ability to generate infographics and custom diagrams allows for new, more effective ways of visual storytelling, making complex information easier to understand.
    • The automated refinement loop lets human experts move from editing drafts to approving final versions, which accelerates project delivery.
    • By maintaining the sandboxed execution for Python and leveraging APIs for image generation, the system extends its capabilities without compromising on security.

    Success story # Highlights:

    • Introduces a “Critical Reviewer” step where a state-of-the-art reasoning LLM performs a holistic quality check on the entire document.
    • Implements an automated feedback loop for iterative refinement, allowing the system to self-correct and improve text and visuals based on AI-generated suggestions.
    • Adds a parallel, multimodal visual generation stream to create rich infographics and conceptual diagrams alongside data-driven charts.

    Sample output – part 1.

    Sample output – part 2.

    An infographic generated by Gemini 2.5 Flash as part of the document.

    Contact: