DreamCanvas setup: to be reviewed

This commit is contained in:
2026-01-21 19:20:00 +01:00
parent b580137ee8
commit 412fa82ff3
22 changed files with 2023 additions and 0 deletions


@@ -0,0 +1,138 @@
## **Architecture and Folder Structure**
### **Overview**
The application is built using the **FastAPI** framework for the backend, allowing it to handle HTTP requests efficiently. The application interacts with external APIs, including an LLM (Large Language Model) service for generating creative prompts and a WebSocket-based service for generating images based on user input.
The **modular architecture** breaks the code into components based on functionality, which promotes separation of concerns, improves maintainability, and scales better as the application grows.
### **Folder Structure**
```bash
├── backend/
│   ├── __init__.py
│   ├── main.py        # Entry point for FastAPI and application startup
│   ├── routes.py      # Contains all the route definitions for the API
│   ├── services.py    # Business logic: interactions with external services like LLMs, image generation, etc.
│   ├── models.py      # Pydantic models such as LLMRequest, PromptRequest, etc.
├── Dockerfile
├── .dockerignore
├── .env
├── .gitignore
├── Readme.md
├── requirements.txt
├── workflow.json
├── quick_prompts.json
├── ui/
│   ├── index.html
│   ├── script.js
│   ├── style.css
```
### **Why This Folder Structure?**
- **Separation of Concerns**: By splitting the application into **routes**, **services**, and **models**, each file has a focused responsibility:
- `routes.py` handles request routing and endpoint definitions.
- `services.py` handles the core business logic, interactions with external services, and complex operations.
- `models.py` contains the data models that define the request and response formats.
- **Scalability**: As the application grows, this structure allows easy addition of new features without bloating any single file. For instance, adding new routes can be done by simply extending `routes.py` without touching the core logic in `services.py`.
- **Reusability**: The modular approach allows the business logic in `services.py` to be reused in different routes or even across multiple applications without needing to rewrite the code.
- **Maintainability**: When each file has a single responsibility, debugging and extending the application become much easier. If a bug occurs in the prompt generation, we know that we only need to investigate `services.py`. Similarly, changes to how data is structured are confined to `models.py`.
---
## **Flow of the Application**
1. **Application Entry Point** (`main.py`)
- The **FastAPI** application is initialized in `main.py`. It serves as the entry point for the entire backend, registering the routes defined in `routes.py` via the `include_router()` function.
- Static files (like HTML, CSS, and JS) are mounted to serve the front-end resources.
```python
app.mount("/static", StaticFiles(directory="ui"), name="static")
```
2. **Routing and Request Handling** (`routes.py`)
- The `routes.py` file defines all the HTTP routes, such as:
- `GET /`: Serves the main HTML file (`index.html`).
- `GET /quick_prompts/`: Serves predefined prompts from the `quick_prompts.json` file.
- `POST /ask_llm/`: Handles requests to the LLM to generate creative ideas based on user input.
- `POST /generate_images/`: Triggers the image generation process.
- The routes define what action should be taken when the FastAPI server receives an HTTP request, but the core business logic is abstracted into `services.py`.
3. **Business Logic and External Services** (`services.py`)
- **Core Logic**: This file handles the interactions with external services (LLM, WebSocket server) and encapsulates the complex operations.
- **Why separate business logic?**: The logic of sending requests to the LLM, generating images, and fetching results from WebSocket is often intricate and should not be mixed with routing. By isolating this functionality in `services.py`, the code becomes more maintainable and reusable.
- **LLM Interaction**:
- The `ask_llm_service` function interacts with the Ollama server to send a prompt and retrieve a creative response. This is done by making a POST request to the LLM service's API and processing the response.
- The separation of this logic means you can easily change the external LLM service in the future without altering the core routing code.
```python
response = requests.post(ollama_server_url, json=ollama_request, headers={"Content-Type": "application/json"})
response_data = response.json()
```
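Expanded into a full function, `ask_llm_service` might look like this sketch. The Ollama endpoint URL, model name, prompt wording, and response field are assumptions (they follow Ollama's documented `/api/generate` API), not the repo's actual values:

```python
import requests

OLLAMA_SERVER_URL = "http://localhost:11434/api/generate"  # assumed Ollama default

def build_ollama_request(positive_prompt: str, model: str = "llama3") -> dict:
    """Assemble the JSON body for the LLM call (non-streaming)."""
    return {
        "model": model,
        "prompt": f"Expand this into a vivid image-generation prompt: {positive_prompt}",
        "stream": False,
    }

def ask_llm_service(positive_prompt: str) -> str:
    payload = build_ollama_request(positive_prompt)
    response = requests.post(
        OLLAMA_SERVER_URL,
        json=payload,
        headers={"Content-Type": "application/json"},
        timeout=60,
    )
    response.raise_for_status()
    # Ollama's non-streaming responses carry the generated text under "response".
    return response.json().get("response", "")
```

Keeping the payload construction in its own helper makes the request shape easy to test without a running LLM server.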
- **Image Generation**:
- The `generate_images` function handles image generation. It connects to a WebSocket server, sends a prompt, and processes the responses, eventually returning the images in a suitable format.
- Again, this complex operation is encapsulated in `services.py`, separating the logic from the routing layer.
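As an illustration of the kind of message `generate_images` queues over the WebSocket, here is a minimal, stdlib-only sketch of building the JSON envelope for a ComfyUI-style server; the field names (`prompt`, `client_id`) are assumptions based on that protocol, not the repo's verified code:

```python
import json
import uuid
from typing import Optional

def build_generation_message(workflow: dict, client_id: Optional[str] = None) -> str:
    """Wrap a workflow node graph (e.g. loaded from workflow.json) in the
    envelope a ComfyUI-style server expects, tagging it with a client id so
    results can be matched back to this session."""
    return json.dumps({
        "prompt": workflow,
        "client_id": client_id or uuid.uuid4().hex,
    })
```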
4. **Data Models and Validation** (`models.py`)
- This file defines the structure of the data used in the application, using **Pydantic** models to enforce strict validation of input.
- **Why Pydantic models?**: They allow for automatic validation of request bodies, making the application more robust by rejecting invalid input before it even reaches the business logic.
- **Models**:
- `LLMRequest`: Defines the schema for a request to the LLM service, ensuring that a `positive_prompt` is provided.
- `PromptRequest`: Defines the schema for the image generation request, ensuring that `positive_prompt`, `negative_prompt`, and other parameters (e.g., image resolution) are valid.
```python
class LLMRequest(BaseModel):
    positive_prompt: str

class PromptRequest(BaseModel):
    positive_prompt: str
    negative_prompt: str
    steps: int = 25
    width: int = 512
    height: int = 512
```
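A quick illustration of the validation these models provide (`PromptRequest` is redefined here so the snippet is self-contained):

```python
from pydantic import BaseModel, ValidationError

class PromptRequest(BaseModel):
    positive_prompt: str
    negative_prompt: str
    steps: int = 25
    width: int = 512
    height: int = 512

# Valid input: missing optional fields fall back to their defaults.
req = PromptRequest(positive_prompt="a sunset", negative_prompt="blurry")
assert req.steps == 25 and req.width == 512

# Invalid input is rejected before any business logic runs.
try:
    PromptRequest(positive_prompt="a sunset")  # negative_prompt is required
    rejected = False
except ValidationError:
    rejected = True
assert rejected
```

In a FastAPI route, this rejection happens automatically: a request body that fails validation is answered with a 422 error before the handler is ever called.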
---
## **Request-Response Flow**
### 1. **Frontend Request**
- A request comes from the front-end (served from `ui/index.html`).
- Examples:
- When the user submits a text prompt for LLM: `POST /ask_llm/`
- When the user requests image generation: `POST /generate_images/`
### 2. **Routing Layer** (`routes.py`)
- The incoming HTTP request is routed by FastAPI, which directs it to the appropriate endpoint handler in `routes.py`.
- **Example**: A request to `POST /ask_llm/` is handled by the `ask_llm` function, which parses the request data and then calls the corresponding function in `services.py`.
### 3. **Business Logic Layer** (`services.py`)
- The service layer handles the core operations:
- If the request involves an LLM, `ask_llm_service()` sends a POST request to the LLM API.
- If the request is for image generation, `generate_images()` opens a WebSocket connection, interacts with the image generation service, and processes the result.
### 4. **Response Generation**
- After processing in the service layer, the results (e.g., LLM response, images) are returned to `routes.py`.
- The route function wraps the results in a suitable response object (e.g., `JSONResponse`, `StreamingResponse`) and returns it to the frontend.
- The front-end then updates based on the server's response.
---
This architecture and folder structure prioritize **clarity, scalability, and maintainability**. By separating the application into layers, each responsible for a single aspect of the program's functionality, we can easily add new features, modify existing functionality, and debug any issues.
- **Routes** handle request dispatching and control the API's behavior.
- **Services** handle all the business logic and complex operations.
- **Models** ensure data consistency and validation, leading to fewer errors.
This modular structure also enables easier testing and deployment since each module can be tested individually without breaking the overall flow of the application.


@@ -0,0 +1,109 @@
## **Running the Application with and without Docker**
This guide explains how to run the application both locally and via Docker, using the command `docker run -d -p 8000:8000 --name dreamcanvas dreamcanvas`. It also covers how to test the application via a web browser and cURL.
---
### **Running the Application Locally**
1. **Install Dependencies**:
Ensure you have Python and all required dependencies installed. In the root of your project directory, run:
```bash
pip install -r requirements.txt
```
2. **Start the Application**:
Run the following command to start the FastAPI server using Uvicorn:
```bash
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
```
3. **Test the Application via Browser**:
Open a web browser and navigate to `http://localhost:8000/docs`. This will bring up **Swagger UI**, where you can interact with the API endpoints.
4. **Test the Application via cURL**:
To test the `/generate_images/` endpoint, run this **cURL** command:
```bash
curl 'http://localhost:8000/generate_images/' \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "positive_prompt": "Pretty woman in her late 20s 4k highly detailed hyperrealistic",
    "negative_prompt": "Trees low quality blurry watermark",
    "steps": 25,
    "width": 512,
    "height": 512
  }'
```
---
### **Running the Application with Docker**
You can also run the application inside a Docker container, which ensures a consistent environment for deployment and testing.
#### **1. Build the Docker Image**
First, ensure you have Docker installed. Navigate to your project root directory (where the `Dockerfile` is located), and run the following command to build the Docker image:
```bash
docker build -t dreamcanvas .
```
This command will create a Docker image named `dreamcanvas`.
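For reference, a `Dockerfile` for a FastAPI application laid out like this one typically looks something like the following. This is a hedged sketch under assumed file paths, not necessarily the repo's actual `Dockerfile`:

```Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker's layer cache survives code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and front-end assets
COPY backend/ backend/
COPY ui/ ui/
COPY workflow.json quick_prompts.json ./

EXPOSE 8000
CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```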
#### **2. Run the Docker Container**
Once the image is built, you can start the application in detached mode using this command:
```bash
docker run -d -p 8000:8000 --name dreamcanvas dreamcanvas
```
- `-d` runs the container in detached mode (in the background).
- `-p 8000:8000` maps port 8000 on your host machine to port 8000 inside the Docker container, making the application accessible via `http://localhost:8000`.
- `--name dreamcanvas` assigns a name to the running container, making it easier to manage.
#### **3. Test the Application in Browser**
Open your web browser and navigate to `http://localhost:8000/docs` to access **Swagger UI**. This interface allows you to interact with all the available API endpoints.
#### **4. Test the Application via cURL**
To test the `/generate_images/` endpoint, use this **cURL** command:
```bash
curl 'http://localhost:8000/generate_images/' \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "positive_prompt": "Pretty woman in her late 20s 4k highly detailed hyperrealistic",
    "negative_prompt": "Trees low quality blurry watermark",
    "steps": 25,
    "width": 512,
    "height": 512
  }'
```
---
### **Stopping and Managing the Docker Container**
#### **Stop the Container**
If you need to stop the container, you can run:
```bash
docker stop dreamcanvas
```
#### **Restart the Container**
To restart the stopped container, use:
```bash
docker start dreamcanvas
```
#### **Remove the Container**
To remove the container when you're done testing, run:
```bash
docker rm -f dreamcanvas
```
This will stop and remove the container completely.


@@ -0,0 +1,153 @@
# ComfyUI Docker Setup with GGUF Support and ComfyUI Manager
This guide provides detailed steps to build and run **ComfyUI** with **GGUF support** and **ComfyUI Manager** using Docker. The GGUF format is optimized for quantized models, and ComfyUI Manager is included for easy node management.
## Prerequisites
Before starting, ensure you have the following installed on your system:
- **Docker**
- **NVIDIA GPU with CUDA support** (if using GPU acceleration)
### 1. Clone the ComfyUI Repository
First, clone the ComfyUI repository to your local machine:
```bash
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
```
### 2. Create the Dockerfile
Create a `Dockerfile` in the root of your ComfyUI directory with the following content:
```Dockerfile
# Base image with Python 3.11 and CUDA 12.5 support
FROM nvidia/cuda:12.5.0-runtime-ubuntu22.04
# Install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    python3-pip \
    libgl1-mesa-glx \
    && rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Copy the cloned ComfyUI repository
COPY . /app
# Install Python dependencies
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
# Clone and install ComfyUI Manager
RUN git clone https://github.com/ltdrdata/ComfyUI-Manager.git /app/custom_nodes/ComfyUI-Manager && \
pip install -r /app/custom_nodes/ComfyUI-Manager/requirements.txt
# Clone and install GGUF support for ComfyUI
RUN git clone https://github.com/city96/ComfyUI-GGUF.git /app/custom_nodes/ComfyUI-GGUF && \
pip install --upgrade gguf
# Expose the port used by ComfyUI
EXPOSE 8188
# Run ComfyUI with the server binding to 0.0.0.0
CMD ["python3", "main.py", "--listen", "0.0.0.0"]
```
### 3. Build the Docker Image
Navigate to the directory where the `Dockerfile` is located and build the Docker image:
```bash
docker build -t comfyui-gguf:latest .
```
This will create a Docker image named `comfyui-gguf:latest` with both **ComfyUI Manager** and **GGUF support** built in.
### 4. Run the Docker Container
Once the image is built, you can run the Docker container with volume mapping for your models.
```bash
docker run --name comfyui -p 8188:8188 --gpus all \
-v /home/mukul/dev-ai/vison/models:/app/models \
-d comfyui-gguf:latest
```
This command maps your local `models` directory to `/app/models` inside the container and exposes ComfyUI on port `8188`.
### 5. Download and Place Checkpoint Models
To use GGUF models or other safetensor models, follow the steps below to download them directly into the `checkpoints` directory.
1. **Navigate to the Checkpoints Directory**:
```bash
cd /home/mukul/dev-ai/vison/models/checkpoints
```
2. **Download `flux1-schnell-fp8.safetensors`**:
```bash
wget "https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell-fp8.safetensors?download=true" -O flux1-schnell-fp8.safetensors
```
3. **Download `flux1-dev-fp8.safetensors`**:
```bash
wget "https://huggingface.co/Comfy-Org/flux1-dev/resolve/main/flux1-dev-fp8.safetensors?download=true" -O flux1-dev-fp8.safetensors
```
These commands will place the corresponding `.safetensors` files into the `checkpoints` directory.
### 6. Access ComfyUI
After starting the container, access the ComfyUI interface in your web browser:
```bash
http://<your-server-ip>:8188
```
Replace `<your-server-ip>` with your server's IP address or use `localhost` if you're running it locally.
### 7. Using GGUF Models
In the ComfyUI interface:
- Use the **UnetLoaderGGUF** node (found in the `bootleg` category) to load GGUF models.
- Ensure your GGUF files are correctly named and placed in the `/app/models/checkpoints` directory for detection by the loader node.
### 8. Managing Nodes with ComfyUI Manager
With **ComfyUI Manager** built into the image:
- **Install** missing nodes as needed when uploading workflows.
- **Enable/Disable** conflicting nodes from the ComfyUI Manager interface.
### 9. Stopping and Restarting the Docker Container
To stop the running container:
```bash
docker stop comfyui
```
To restart the container:
```bash
docker start comfyui
```
### 10. Logs and Troubleshooting
To view the container logs:
```bash
docker logs comfyui
```
This will provide details if anything goes wrong or if you encounter issues with GGUF models or node management.
---
This `README.md` provides the complete steps to set up **ComfyUI with GGUF support** in Docker, with instructions for downloading models into the checkpoints directory and managing nodes using ComfyUI Manager.


@@ -0,0 +1,33 @@
# Base image with Python 3.11 and CUDA 12.5 support
FROM nvidia/cuda:12.5.0-runtime-ubuntu22.04
# Install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    python3-pip \
    libgl1-mesa-glx \
    && rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Copy the cloned ComfyUI repository
COPY . /app
# Install Python dependencies
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
# Clone and install ComfyUI Manager
RUN git clone https://github.com/ltdrdata/ComfyUI-Manager.git /app/custom_nodes/ComfyUI-Manager && \
pip install -r /app/custom_nodes/ComfyUI-Manager/requirements.txt
# Clone and install GGUF support for ComfyUI
RUN git clone https://github.com/city96/ComfyUI-GGUF.git /app/custom_nodes/ComfyUI-GGUF && \
pip install --upgrade gguf
# Expose the port used by ComfyUI
EXPOSE 8188
# Run ComfyUI with the server binding to 0.0.0.0
CMD ["python3", "main.py", "--listen", "0.0.0.0"]


@@ -0,0 +1,186 @@
## JavaScript Methods Documentation for **DreamCanvas**
### Overview
This documentation provides an overview of the key methods in the JavaScript file responsible for handling user interactions, form submissions, image generation, and dynamic prompt loading. It also explains how these methods are connected to the elements defined in the HTML structure.
These JavaScript methods drive the interactive functionality of the **DreamCanvas** app, handling form submissions, prompt input, and dynamic image generation. They are closely connected to the corresponding HTML elements, enabling a smooth and responsive user experience.
---
### **1. `startTimer()`**
**Description:**
Starts a timer that displays the elapsed time since the image generation process started.
- **How it works:**
- Captures the current time when the generation process begins.
- Updates the elapsed time display every second.
- This is useful to show the user how long the process is taking.
- **Connected HTML Elements:**
- `elapsedTime`: This element is updated by the method to display the current time in the format `(MM:SS)`.
---
### **2. `stopTimer()`**
**Description:**
Stops the timer once the image generation process is complete.
- **How it works:**
- Clears the interval that updates the timer.
- Hides the timer display by adding the CSS class `d-none`.
- **Connected HTML Elements:**
- `elapsedTime`: The timer display is hidden after this method is called.
---
### **3. `showNotification(message, type)`**
**Description:**
Displays a notification to the user with a custom message and type (success or error).
- **Parameters:**
- `message`: The message to be displayed in the notification.
- `type`: The type of notification (`success` or `danger`), which determines the visual style of the alert.
- **How it works:**
- Sets the content and type of the notification element.
- Automatically hides the notification after 3 seconds.
- **Connected HTML Elements:**
- `notification`: This is the element where the notification message is displayed.
---
### **4. `resetBtn.addEventListener("click", function () { ... })`**
**Description:**
Handles the form reset action.
- **How it works:**
- Clears the input fields for prompts.
- Hides the image result, prompts, and navigation buttons.
- Resets the button text and state.
- Resets any cached image history and disables image navigation.
- Displays a notification confirming that the form has been reset.
- **Connected HTML Elements:**
- `positivePrompt` & `negativePrompt`: The input fields for prompts are cleared.
- `generatedImage`: Hides the generated image.
- `promptDisplay`: Hides the display of positive and negative prompts.
- `imageNavigation`: Hides the navigation buttons for switching between images.
---
### **5. `loadQuickPrompts()`**
**Description:**
Loads quick prompts from the server and populates the respective buttons dynamically.
- **How it works:**
- Fetches quick prompts (positive, negative, and others) from the server.
- Dynamically generates buttons based on the prompt categories.
- Each button, when clicked, adds its respective prompt to the corresponding input field.
- **Connected HTML Elements:**
- `quickPromptsContainer`: Contains the buttons for different categories of quick prompts.
- `positiveKeywords` & `negativeKeywords`: Containers for dynamically generated prompt buttons.
---
### **6. `addPositiveKeyword(button, keyword)`**
**Description:**
Appends a positive keyword to the `positivePrompt` input field when a quick prompt button is clicked.
- **How it works:**
- Updates the `positivePrompt` input field with the selected keyword.
- Disables the button after it has been clicked to prevent multiple additions of the same keyword.
- **Connected HTML Elements:**
- `positivePrompt`: The field where the selected positive prompt keyword is added.
---
### **7. `askLLMButton.addEventListener("click", function () { ... })`**
**Description:**
Handles the action of asking the LLM (Large Language Model) for a creative idea based on the current positive prompt.
- **How it works:**
- Sends the current positive prompt (or a default one) to the server.
- Displays a spinner to indicate processing.
- Receives the LLM-generated prompt from the server and updates the `llmResponseTextarea`.
- Shows the "Use LLM's Creative Prompt" button to allow the user to apply the prompt.
- **Connected HTML Elements:**
- `positivePrompt`: The input value is sent to the server as part of the request.
- `llmResponseTextarea`: Displays the response from the LLM.
- `askLLMSpinner`: Shows the spinner while waiting for the response.
- `useLLMResponseButton`: Becomes visible once the response is ready.
---
### **8. `useLLMResponseButton.addEventListener("click", function () { ... })`**
**Description:**
Uses the LLM's generated creative prompt as the positive prompt.
- **How it works:**
- Copies the content of the `llmResponseTextarea` into the `positivePrompt` input field.
- Displays a notification confirming that the prompt has been applied.
- **Connected HTML Elements:**
- `positivePrompt`: Receives the LLM's response as its new value.
- `llmResponseTextarea`: Source of the creative prompt.
---
### **9. `form.addEventListener("submit", function (event) { ... })`**
**Description:**
Handles the image generation process when the form is submitted.
- **How it works:**
- Prevents the default form submission behavior.
- Disables the "Generate Image" button and starts the spinner and timer.
- Sends the form data (positive prompt, negative prompt, steps, width, height) to the server.
- Displays the generated image, updates the image history, and enables navigation buttons for image history.
- Resets the button and timer after the process is complete.
- **Connected HTML Elements:**
- `positivePrompt`, `negativePrompt`, `steps`, `width`, `height`: Input fields that send data to the server.
- `generatedImage`: Displays the generated image.
- `spinner`: Shows a loading spinner during the process.
- `buttonText`: Updates the button text to indicate image generation.
- `imageNavigation`: Shows the image navigation buttons after the image is generated.
---
### **10. `updateImageNavigation()`**
**Description:**
Updates the state of the previous and next buttons based on the current position in the image history.
- **How it works:**
- Enables/disables the "Previous" and "Next" buttons depending on whether there are previous/next images in the history.
- **Connected HTML Elements:**
- `prevImage` & `nextImage`: Navigation buttons to switch between generated images.
- `imageNavigation`: Shows or hides the entire navigation section.
---
### **11. `prevImageBtn.addEventListener("click", function () { ... })`**
**Description:**
Navigates to the previous image in the image history when the "Previous" button is clicked.
- **How it works:**
- Decreases the `currentImageIndex` and updates the `generatedImage` to display the previous image.
- **Connected HTML Elements:**
- `generatedImage`: Displays the previous image.
---
### **12. `nextImageBtn.addEventListener("click", function () { ... })`**
**Description:**
Navigates to the next image in the image history when the "Next" button is clicked.
- **How it works:**
- Increases the `currentImageIndex` and updates the `generatedImage` to display the next image.
- **Connected HTML Elements:**
- `generatedImage`: Displays the next image.