Bugs occurred in other datasets #2

Open
Cloudcatcher888 opened this issue Sep 18, 2019 · 1 comment

Comments


Cloudcatcher888 commented Sep 18, 2019

I get the following error when running on the Tmall dataset:

(base) wzk@ddst:~/work/Sets2Sets$ python Sets2Sets.py ./data/alibaba_history.csv ./data/alibaba_future.csv 1 2 1
start dictionary generation...
{'MATERIAL_NUMBER': 9531}
# dimensions of final vector: 9531 | 2962
finish dictionary generation*****
num of vectors having entries more than 1: 16462
num of vectors having entries more than 1: 15275
Traceback (most recent call last):
  File "Sets2Sets.py", line 990, in <module>
    main(sys.argv)
  File "Sets2Sets.py", line 955, in main
    codes_freq = get_codes_frequency_no_vector(data_chunk[past_chunk],input_size,data_chunk[future_chunk].keys())
  File "Sets2Sets.py", line 935, in get_codes_frequency_no_vector
    for idx in X[pid]:
KeyError: '371250'

Has anyone run into this before? I'd really appreciate any help.

HaojiHu (Owner) commented Sep 18, 2019

The KeyError means that pid is not a key in X, which here is data_chunk[past_chunk]. Make sure there is a pid 371250 in data_chunk[past_chunk]. On my data I ran a preprocessing step so that every key in data_chunk[future_chunk].keys() is also contained in data_chunk[past_chunk].keys(). Thanks for reporting this. I will try to fix it later.
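
One way to apply that containment check before calling get_codes_frequency_no_vector is sketched below. This is only an illustration, not the preprocessing used in the repository: it assumes both chunks are plain dicts keyed by pid (as the X[pid] lookup and the .keys() call in the traceback suggest), and the helper name align_future_to_past and the sample data are hypothetical.

```python
def align_future_to_past(past, future):
    """Keep only the future entries whose pid also appears in past.

    Hypothetical helper: assumes `past` and `future` are dicts keyed by pid.
    Returns the filtered future dict and the sorted list of dropped pids.
    """
    missing = sorted(pid for pid in future if pid not in past)
    aligned = {pid: sets for pid, sets in future.items() if pid in past}
    return aligned, missing


# Made-up sample data: pid '371250' exists only in the future chunk.
past = {"1": [[3, 5]], "2": [[7]]}
future = {"1": [[5]], "371250": [[9]]}

aligned, missing = align_future_to_past(past, future)
print("pids missing from the past chunk:", missing)
```

Passing aligned.keys() instead of the unfiltered future keys would avoid the failing lookup, and the missing list shows which users to inspect in the history CSV.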
