Performance Bottlenecks in Persistent<T> #171

Open
songweijia opened this issue Jul 22, 2020 · 0 comments

@songweijia
Contributor

There is a performance bottleneck in Persistent&lt;T&gt;. I found it while evaluating Persistent&lt;T&gt; bandwidth with large operation sizes on a fast NVMe device (2 GB/s peak write throughput). Compared with a slow SSD (~500 MB/s peak write throughput), the fast NVMe device improved the Persistent&lt;T&gt; bandwidth test results only marginally, even with large message sizes (~100 MB).

I found that Persistent&lt;T&gt;::version() takes a long time to append data to the log entry. The current object of type T is first serialized into a newly allocated memory buffer and then appended to the log. This introduces extra overhead: allocating the temporary buffer and copying its contents into the memory-mapped log data region. That overhead was acceptable with slow persistent devices, but with a fast device (2 GB/s is already 1/3 to 1/4 of memory bandwidth) it begins to dominate. Two optimizations can be done here:

  1. Acquire the memory region from the persist log beforehand and fill it in place, as we do with the ordered-send buffer (see the first sketch below).
  2. Use huge pages to manage the memory buffers in PersistLog to reduce the memory allocation overhead in PersistLog::append() (see the second sketch below).
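
A minimal sketch of the first idea, assuming hypothetical PersistLog methods acquire_entry()/commit_entry() for reserving and publishing a region of the memory-mapped log, and an object exposing bytes_size()/to_bytes() in the style of mutils serialization. This is not the actual Derecho API, just an illustration of serializing directly into the log instead of going through a temporary buffer:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of a log that hands out space inside its memory-mapped
// data region instead of accepting a caller-allocated buffer.
struct PersistLog {
    char* mapped_region;    // start of the mmap'ed log data region
    std::size_t tail = 0;   // next free offset in the region

    // Reserve 'size' bytes in place and return a pointer for the caller to fill.
    char* acquire_entry(std::size_t size, int64_t /*version*/) {
        char* entry = mapped_region + tail;
        tail += size;
        return entry;
    }
    // Publish the filled entry (update metadata, flush, ...) -- omitted here.
    void commit_entry(int64_t /*version*/) {}
};

// Serialize the current object of T directly into the reserved region,
// avoiding the temporary heap buffer and the extra memcpy of the current path.
template <typename T>
void make_version_in_place(PersistLog& log, const T& object, int64_t version) {
    const std::size_t size = object.bytes_size();   // serialized size of the object
    char* dest = log.acquire_entry(size, version);  // region inside the mmap'ed log
    object.to_bytes(dest);                          // one write, straight into the log
    log.commit_entry(version);                      // make the new version visible
}
```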
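
And a minimal, Linux-specific sketch of the second idea: backing PersistLog's buffers with huge pages via mmap(MAP_HUGETLB), falling back to regular pages when none are reserved. The function name is illustrative and assumes 2 MB huge pages have been configured (e.g. via /proc/sys/vm/nr_hugepages):

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <stdexcept>

// Allocate a buffer backed by 2 MB huge pages to cut page-fault and
// allocation overhead for large appends; fall back to regular pages
// if no huge pages are available.
void* allocate_hugepage_buffer(std::size_t size) {
    constexpr std::size_t huge_page_size = 2 * 1024 * 1024;
    // Round the request up to a hugepage boundary.
    size = (size + huge_page_size - 1) & ~(huge_page_size - 1);

    void* buf = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (buf == MAP_FAILED) {
        // No huge pages reserved on this machine: use regular pages instead.
        buf = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) {
            throw std::runtime_error("PersistLog buffer allocation failed");
        }
    }
    return buf;
}
```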