MemoryError while loading graph #71

Open
GiulioRossetti opened this issue Jun 5, 2020 · 14 comments

@GiulioRossetti

If I try to load a relatively small network (~86980 edges) composed of three snapshots I get the following error:

MemoryError: Unable to allocate array with shape (110011, 110011, 3) and data type float64

I have the same behavior loading an edgelist as well as a pandas DataFrame.

Is it a known scalability issue or is there something that I am missing?
If so, what's the network size that teneto is able to handle?
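For scale, the size of the array named in the error message can be worked out directly from its shape. The sketch below is plain arithmetic, assuming 8 bytes per float64 entry; `dense_bytes` is a hypothetical helper, not part of teneto.

```python
def dense_bytes(n_nodes: int, n_times: int, itemsize: int = 8) -> int:
    """Bytes needed for a dense (n_nodes, n_nodes, n_times) array.

    itemsize defaults to 8 because float64 uses 8 bytes per entry.
    """
    return n_nodes * n_nodes * n_times * itemsize


# Shape taken from the MemoryError: (110011, 110011, 3), float64.
size = dense_bytes(110011, 3)
print(f"{size / 1e9:.1f} GB ({size / 2**30:.1f} GiB)")  # 290.5 GB (270.5 GiB)
```

A dense array of that shape would need roughly 290 GB, which is why the allocation fails no matter how few edges the network contains.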

@wiheto
Owner

wiheto commented Jun 5, 2020

Unfortunately, we have some speed/size issues that I just haven't had time to address, as this is more of a hobby project at the moment. Most of the functions are designed to work with networks of size 1000x1000x1000.

I am assuming you are trying to use the TemporalNetwork class?

If yes, you can try the hdf5 flag. You may hit a MemoryError a little later, depending on what you try to do, as not everything is hdf5 optimized. Or you may hit some speed issues. But, hopefully, that will work.

@GiulioRossetti
Author

Thanks for your quick reply!
Yes, I'm using the TemporalNetwork class. I'll try the hdf5 flag, thanks for the suggestion!

@wiheto
Owner

wiheto commented Jun 5, 2020

No problem. Let me know how it goes.

If not, I will make sure we get that network size covered when I get around to rewriting the core of teneto (but finding time is the problem).

@GiulioRossetti
Author

Unfortunately, setting the flag has no effect.

@wiheto
Owner

wiheto commented Jun 5, 2020

Could you explain a little more (just for my understanding)? Did it fail to load the network, or to run a function after it was loaded? If the latter, which function?

@GiulioRossetti
Author

Still at loading stage.

@wiheto
Owner

wiheto commented Jun 5, 2020

And the network is dense (i.e. few edges are 0)?

@GiulioRossetti
Author

The network is not particularly dense: ~87k directed edges for 8k nodes, giving an approximate static-graph density of 0.001.
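The error reports a node dimension of 110011 although the network has only about 8k nodes. One possible explanation, not confirmed anywhere in this thread, is that the node dimension is derived from the largest node ID rather than from the number of distinct nodes. Under that assumption, relabelling the IDs to a contiguous range before loading would shrink the array. The sketch below uses only NumPy, and the edge list in it is made up for illustration.

```python
import numpy as np

# Hypothetical (i, j, t) edge list with sparse, non-contiguous node IDs.
edges = np.array([
    [110010, 42, 0],
    [42, 9001, 1],
    [9001, 110010, 2],
])

# Collect all node IDs (sources then targets) and map each distinct ID
# onto 0..n_nodes-1. np.unique returns the sorted IDs and, for every
# input entry, its index into that sorted array.
all_ids = np.concatenate([edges[:, 0], edges[:, 1]])
node_ids, inverse = np.unique(all_ids, return_inverse=True)

n_edges = len(edges)
relabelled = np.column_stack([
    inverse[:n_edges],   # new source index
    inverse[n_edges:],   # new target index
    edges[:, 2],         # time index, unchanged
])

print(len(node_ids))        # 3
print(relabelled.tolist())  # [[2, 0, 0], [0, 1, 1], [1, 2, 2]]
```

`node_ids` keeps the original labels, so results can be mapped back afterwards. Whether this avoids the MemoryError in teneto has not been verified here.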

@wiheto
Owner

wiheto commented Jun 5, 2020

Thanks. Could you send the complete error message (i.e. the ca. 10-20 lines above the MemoryError)? It will help me isolate the memory-hogging process.

Sorry about this.

@GiulioRossetti
Author

Don't worry, I'll send you the whole stacktrace first thing tomorrow morning!

@GiulioRossetti
Author

Here it is:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-76-bd3519b38063> in <module>
----> 1 tnet = TemporalNetwork(from_edgelist=edges, hdf5=True, hdf5path="data/", diagonal=False)
      2 tnet.network

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in __init__(self, N, T, nettype, from_df, from_array, from_dict, from_edgelist, timetype, diagonal, timeunit, desc, starttime, nodelabels, timelabels, hdf5, hdf5path, forcesparse)
    139             self.network_from_df(from_df)
    140         if from_edgelist is not None:
--> 141             self.network_from_edgelist(from_edgelist)
    142         elif from_array is not None:
    143             self.network_from_array(from_array, forcesparse=forcesparse)

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in network_from_edgelist(self, edgelist)
    264             colnames = ['i', 'j', 't']
    265         self.network = pd.DataFrame(edgelist, columns=colnames)
--> 266         self._update_network()
    267 
    268     def network_from_dict(self, contact):

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in _update_network(self)
    226         """Helper function that updates the network info"""
    227         self._calc_netshape()
--> 228         self._set_nettype()
    229         if self.nettype:
    230             if self.nettype[1] == 'u':

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in _set_nettype(self)
    179             self.nettype = 'xu'
    180             G1 = teneto.utils.df_to_array(
--> 181                 self.network, self.netshape, self.nettype)
    182             self.nettype = 'xd'
    183             G2 = teneto.utils.df_to_array(

~/anaconda3/lib/python3.7/site-packages/teneto/utils/utils.py in df_to_array(df, netshape, nettype)
    764     if len(df) > 0:
    765         idx = np.array(list(map(list, df.values)))
--> 766         tnet = np.zeros([netshape[0], netshape[0], netshape[1]])
    767         if idx.shape[1] == 3:
    768             if nettype[-1] == 'u':

MemoryError: Unable to allocate array with shape (110011, 110011, 2) and data type float64
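The failing line in the trace, `np.zeros([netshape[0], netshape[0], netshape[1]])`, allocates a dense nodes x nodes x time array, so its cost depends on the node dimension and not on the number of edges. As a rough illustration of an alternative outside teneto, the sketch below stores one SciPy sparse matrix per time point, so memory grows mainly with the edge count. `snapshots_from_edgelist` is a hypothetical helper, not a teneto function, and the edge list is invented.

```python
import numpy as np
from scipy import sparse


def snapshots_from_edgelist(edges, n_nodes, n_times):
    """Build one sparse adjacency matrix per time point from (i, j, t) rows."""
    edges = np.asarray(edges)
    snapshots = []
    for t in range(n_times):
        rows = edges[edges[:, 2] == t]   # edges active at time t
        data = np.ones(len(rows))        # unweighted: every edge has weight 1
        adj = sparse.coo_matrix(
            (data, (rows[:, 0], rows[:, 1])), shape=(n_nodes, n_nodes)
        )
        snapshots.append(adj.tocsr())    # CSR supports fast row slicing
    return snapshots


# Tiny made-up edge list placed in a node space as large as the one in the error.
edges = [[0, 1, 0], [1, 2, 0], [2, 0, 1]]
snaps = snapshots_from_edgelist(edges, n_nodes=110011, n_times=2)
print([s.nnz for s in snaps])  # [2, 1]
```

Note that converting to CSR sums duplicate (i, j) entries within a snapshot, which may or may not be the desired behaviour for repeated contacts.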

@wiheto
Owner

wiheto commented Jun 6, 2020 via email

@GiulioRossetti
Author

Tried: same stacktrace.

@wiheto
Owner

wiheto commented Jun 6, 2020 via email
