BUG: concatenating non overlapping time series with non ns unit leads to dataframe with missing data #58471
Comments
I am not sure if it is only related to the unit. Here is an example with
Also notice that in both outputs the
Reproducible Example
|
This is happening for me even with
import pandas as pd
idx1 = pd.date_range("2013-08-16", "2024-05-01", freq="D", unit="us")
idx2 = pd.date_range("2015-09-19", "2024-05-01", freq="D", unit="us")
ts1 = pd.Series(range(len(idx1)), index=idx1)
ts2 = pd.Series(range(len(idx2)), index=idx2)
df1 = pd.concat([ts1, ts2], axis=1)
print(df1.index)
Output:
Note how it loses frequency and also adds bogus dates at the end! Really scary bug, had me scratching my head for a long time.
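To make the lost frequency and the bogus dates easier to spot than by reading a printed index, something like the following sketch could be run on top of the snippet above. The `index_diff` helper is hypothetical (not part of pandas) and deliberately compares labels through plain Python sets, so it does not rely on the index set operations that may themselves be affected. It assumes, as in the example, that every date in `idx2` also appears in `idx1`, so the expected result index is simply `idx1`.

```python
import pandas as pd


def index_diff(expected, result):
    """Hypothetical helper: return (missing, extra) labels as sorted lists.

    Labels are compared through plain Python sets rather than Index set
    operations, to stay independent of the code path under suspicion.
    """
    expected_set, result_set = set(expected), set(result)
    return sorted(expected_set - result_set), sorted(result_set - expected_set)


idx1 = pd.date_range("2013-08-16", "2024-05-01", freq="D", unit="us")
idx2 = pd.date_range("2015-09-19", "2024-05-01", freq="D", unit="us")
ts1 = pd.Series(range(len(idx1)), index=idx1)
ts2 = pd.Series(range(len(idx2)), index=idx2)

# idx2 is contained in idx1, so a correct column-wise concat should keep exactly idx1.
expected = idx1
result = pd.concat([ts1, ts2], axis=1).index

missing, extra = index_diff(expected, result)
print(f"rows: expected={len(expected)}, got={len(result)}")
print(f"missing dates: {len(missing)}, unexpected dates: {len(extra)}")
print(f"freq: expected={expected.freq}, got={result.freq}")
```

On an unaffected pandas build both counts should be zero; any non-zero count points at the rows that were dropped or invented.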
|
Can confirm this is still an issue on the latest commit in the master branch. @jorgehv you may want to update your original post to tick the box?
import pandas as pd
print(pd.__version__)
idx1 = pd.date_range("2013-08-16", "2024-05-01", freq="D", unit="us")
idx2 = pd.date_range("2015-09-19", "2024-05-01", freq="D", unit="us")
ts1 = pd.Series(range(len(idx1)), index=idx1)
ts2 = pd.Series(range(len(idx2)), index=idx2)
df1 = pd.concat([ts1, ts2], axis=1)
print(df1.index)
Output:
|
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
When concatenating several non-overlapping time series DataFrames column-wise, if the unit of the original DataFrames' index is not 'ns', the resulting DataFrame has missing data (and loses its frequency value, in case that is relevant).
The example given has 3 DataFrames, and for some reason the result is missing most of the data from the first and second DataFrames. When concatenating 2 DataFrames, we end up with almost no data at all.
If we set the unit to 'ns', everything works as expected: the resulting df has all the data and keeps its frequency of '5min'. Every other unit I tried failed with results similar to the example.
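Since 'ns' is reported to work, one possible stopgap is to cast each index to 'ns' before concatenating. The sketch below is only an illustration under that assumption: `concat_as_ns` is a hypothetical helper, the two short series are made up for the example (they are not the original reproducer), and the cast only makes sense when all timestamps fit in the nanosecond range (roughly years 1677 to 2262).

```python
import pandas as pd


def concat_as_ns(series_list):
    """Hypothetical helper: cast each DatetimeIndex to 'ns', then concat column-wise."""
    converted = [s.set_axis(s.index.as_unit("ns")) for s in series_list]
    return pd.concat(converted, axis=1)


# Two made-up, non-overlapping 5-minute series with a second-resolution index.
idx_a = pd.date_range("2024-01-01 00:00", periods=6, freq="5min", unit="s")
idx_b = pd.date_range("2024-01-01 00:30", periods=6, freq="5min", unit="s")
a = pd.Series(range(6), index=idx_a, name="a")
b = pd.Series(range(6), index=idx_b, name="b")

df = concat_as_ns([a, b])
print(df.index.dtype)  # index unit after the cast
print(len(df))         # one row per distinct timestamp across both inputs
```

If the original unit matters downstream, the result index can be cast back afterwards with `df.index.as_unit(...)`.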
Expected Behavior