
Pygwalker cannot render too much data #546

Closed

heqi201255 opened this issue May 13, 2024 · 1 comment

heqi201255 commented May 13, 2024

I was trying to plot my data using Pygwalker. The data is a CSV file of about 467 MB with shape (3682080, 12). My code looks like this:

from pygwalker.api.streamlit import StreamlitRenderer
import pandas as pd
import streamlit as st

# Adjust the width of the Streamlit page
st.set_page_config(
    page_title="Use Pygwalker In Streamlit",
    layout="wide"
)

# Add Title
st.title("Use Pygwalker In Streamlit")

# Cache your pygwalker renderer if you don't want your memory usage to explode
@st.cache_resource
def get_pyg_renderer() -> "StreamlitRenderer":
    df = pd.read_csv("/data.csv")
    # If you want to use the chart-config saving feature, set `spec_io_mode="rw"`
    return StreamlitRenderer(df, kernel_computation=True)


renderer = get_pyg_renderer()

renderer.explorer()

I tried to use pygwalker both inside Jupyter and via Streamlit; both gave me the error "The query returned too many data entries, making it difficult for the frontend to render. Please adjust your chart configuration and try again."

[Screenshot, May 13, 2024: the error message shown over the stuck visualization]

The visualization is stuck at loading, and I got a timeout message afterwards. Is there any workaround to render my data? What chart configuration should I adjust?
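One general-purpose workaround (not from this thread, just a sketch assuming the bottleneck is raw row count) is to downsample the DataFrame before handing it to any renderer; the helper name `downsample` below is illustrative, not a pygwalker API:

```python
import pandas as pd


def downsample(df: pd.DataFrame, max_rows: int = 1_000_000, seed: int = 0) -> pd.DataFrame:
    """Randomly sample the frame down to max_rows so the frontend has less to render."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=seed)


# Synthetic data standing in for the large CSV:
df = pd.DataFrame({"x": range(2_000_000)})
small = downsample(df, max_rows=500_000)
```

Random sampling keeps the overall distribution roughly intact, which is usually enough for exploratory charting, though it can hide rare outliers.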

longxiaofei (Member) commented May 13, 2024

Hi @heqi201255

Thank you for bringing up this issue with pygwalker. By default, pygwalker enforces a fixed limit on query result size to keep memory usage in the frontend browser safe.

When the number of distinct data points in a query (count(distinct t)) exceeds 1,000,000 (1 million), it becomes challenging for the frontend to render that much data into a chart efficiently.
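Another way to stay under a distinct-count limit like this, sketched here with plain pandas (the function name is illustrative, not part of pygwalker), is to pre-aggregate before plotting so that many raw rows collapse into one point per category:

```python
import pandas as pd


def aggregate_for_chart(df: pd.DataFrame, by: str, value: str) -> pd.DataFrame:
    """Collapse raw rows into one row per category: far fewer distinct points to draw."""
    return df.groupby(by, as_index=False)[value].mean()


# 1,000,000 raw rows spread over only 24 hourly buckets -> 24 rows to render.
raw = pd.DataFrame({
    "hour": [i % 24 for i in range(1_000_000)],
    "load": [float(i % 100) for i in range(1_000_000)],
})
chart_df = aggregate_for_chart(raw, by="hour", value="load")
```

This trades per-row detail for a chart the frontend can render instantly; pick the aggregation (mean, sum, count) that matches the question being asked.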

To address this issue, we are considering adding a new parameter that allows users to control the maximum data size for rendering. This parameter will provide flexibility and allow users to adjust the size according to their specific needs.

One possible solution is to introduce the following code snippet, which raises the maximum data length to 10,000,000 (10 million):

import pygwalker as pyg

pyg.GlobalVarManager.set_max_data_length(10 * 1000 * 1000)

We would appreciate your thoughts and feedback on this proposed solution. Please let us know if you have any suggestions or concerns.
