Unexpected performance of hash maps #785

jannisteunissen · 2024-04-02T19:34:49Z

Today I started exploring the stdlib (which I think is a great effort!) and I did a performance test of the hash map implementation. The test I did is rather simple: generate keys (with duplicates), and then add a key if it is not yet present, and remove a key if it is already present. The code is copied below. I compared performance against another Fortran hash map implementation I once made (https://github.com/jannisteunissen/ffhash), which was based on khash from https://github.com/attractivechaos/klib

With standard compile flags (-O2) and a test size of 5 million, on an Intel i7-1185G7 laptop CPU, I got the following numbers:

open_hashmap_type from stdlib, using fnv_1_hasher:
Elapsed time (s) 0.2749E+01
Entries/s 0.1819E+07

while the other code (ffhash, murmur3 hasher)

Elapsed time (s) 0.2541E+00
Entries/s 0.1968E+08

A factor 10 seems a bit much. I would expect some overhead due to the class-type interface, but not this much. Perhaps this benchmark can be useful in improving the performance.

program open_hashmap_benchmark
  use stdlib_kinds, only: dp, int8, int32
  use stdlib_hashmaps, only : open_hashmap_type
  use stdlib_hashmap_wrappers

  implicit none

  type(open_hashmap_type)     :: map
  integer, parameter          :: n_max = 5*1000*1000
  integer                     :: n
  integer(int32), allocatable :: keys(:)
  integer, allocatable        :: key_counts(:)
  real(dp)                    :: t_start, t_end
  real(dp), allocatable       :: r_uniform(:)
  type(key_type)              :: key
  logical                     :: present
  integer(int8)               :: int32_as_int8(4)

  call map%init(fnv_1_hasher, slots_bits=10)

  allocate(keys(n_max), r_uniform(n_max))

  call random_number(r_uniform)
  keys = nint(r_uniform * n_max * 0.25_dp)

  call cpu_time(t_start)
  do n = 1, n_max
     int32_as_int8 = transfer(keys(n), int32_as_int8)
     call set(key, int32_as_int8)

     call map%key_test(key, present)

     if (present) then
        call map%remove(key)
     else
        call map%map_entry(key)
     end if
  end do
  call cpu_time(t_end)

  write(*, "(A,E12.4)") "Elapsed time (s) ", t_end - t_start
  write(*, "(A,E12.4)") "Entries/s        ", n_max/(t_end - t_start)
  write(*, "(A,I12)")   "n_occupied       ", map%entries()
  write(*, "(A,I12)")   "n_buckets        ", map%num_slots()

  ! Count number of keys that occur an odd number of times
  allocate(key_counts(minval(keys):maxval(keys)))
  key_counts = 0
  do n = 1, n_max
     key_counts(keys(n)) = key_counts(keys(n)) + 1
  end do
  n = sum(iand(key_counts, 1))

  if (n /= map%entries()) then
     error stop "FAILED"
  else
     print *, "PASSED"
  end if

  ! Clean up allocated storage
  deallocate(key_counts)

end program open_hashmap_benchmarka

The benchmark I compared against is https://github.com/jannisteunissen/ffhash/blob/master/example_benchmark.f90

jvdp1 · 2024-04-02T20:42:40Z

Thank you @jannisteunissen for this report. I tested your code on my machine, and I got similar results with gfortran and the fpm options used with the profile release. I tested other hashers and got similar times.

gareth-nx · 2024-04-03T09:24:15Z

@wclodius2

gareth-nx · 2024-04-04T11:15:24Z

I have limited knowledge of hash maps, but am puzzled by this small modification of @jannisteunissen's program.

If replace the open_hashmap_type with a chaining_hashmap_type then the speed is significantly faster, but the program fails at the error stop.

From inspection, it seems that for chaining_hashmap_type a call to map%remove() does not always lead to map%entries() being reduced by 1. Whereas that does happen for the open_hashmap_type.

See commented-out print statements around map%remove.
The subroutine called by map%remove is here for the open hashmap and here for the chaining hashmap

Does anyone know if this is expected? @wclodius2

program chaining_hashmap_benchmark
  use stdlib_kinds, only: dp, int8, int32
  use stdlib_hashmaps, only : open_hashmap_type, chaining_hashmap_type
  use stdlib_hashmap_wrappers

  implicit none

  !! Choice of hashmap -- should these do the same thing? 
  !type(open_hashmap_type)     :: map 
  type(chaining_hashmap_type)     :: map

  integer, parameter          :: n_max = 5*1000*1000
  integer                     :: n
  integer(int32), allocatable :: keys(:)
  integer, allocatable        :: key_counts(:)
  real(dp)                    :: t_start, t_end
  real(dp), allocatable       :: r_uniform(:)
  type(key_type)              :: key
  logical                     :: present
  integer(int8)               :: int32_as_int8(4)

  call map%init(fnv_1_hasher, slots_bits=10)

  allocate(keys(n_max), r_uniform(n_max))

  call random_number(r_uniform)
  keys = nint(r_uniform * n_max * 0.25_dp)

  call cpu_time(t_start)
  do n = 1, n_max
     int32_as_int8 = transfer(keys(n), int32_as_int8)
     call set(key, int32_as_int8)

     call map%key_test(key, present)

     if (present) then
        !print*, map%entries()
        call map%remove(key)
        !print*, map%entries() ! Did this reduce?
     else
        call map%map_entry(key)
     end if
  end do
  call cpu_time(t_end)

  write(*, "(A,E12.4)") "Elapsed time (s) ", t_end - t_start
  write(*, "(A,E12.4)") "Entries/s        ", n_max/(t_end - t_start)
  write(*, "(A,I12)")   "n_occupied       ", map%entries()
  write(*, "(A,I12)")   "n_buckets        ", map%num_slots()

  ! Count number of keys that occur an odd number of times
  allocate(key_counts(minval(keys):maxval(keys)))
  key_counts = 0
  do n = 1, n_max
     key_counts(keys(n)) = key_counts(keys(n)) + 1
  end do
  n = sum(iand(key_counts, 1))

  if (n /= map%entries()) then
     error stop "FAILED"
  else
     print *, "PASSED"
  end if

  ! Clean up allocated storage
  deallocate(key_counts)

end program chaining_hashmap_benchmark

jvdp1 · 2024-04-04T11:40:17Z

@gareth-nx I spotted the same issue, and plan to submit a PR soon to fix it.

gareth-nx · 2024-04-04T11:44:09Z

@jvdp1 I was just looking at this, and found that if we append the line

map % num_entries = map % num_entries - 1

right at the end of remove_chaining_entry, then the test passes, while the speed remains much better than for open_hashmap_type.

I guess we need to add some additional tests of the hashmaps!

jannisteunissen · 2024-04-04T21:09:49Z

Glad to see that this test was also useful in other ways! I'd be happy to contribute it to to the test suite. Let me also mention that the code I compared with used open addressing (and quadratic probing), so in principle better performance should be possible for open_hashmap_type.

wclodius2 · 2024-04-10T16:10:46Z

I have been very busy lately, and am far behind in responding to my emails. I apologize for only responding at this late a date. I had stopped participating in the stdlib effort, because other aspects of my life made it impractical to participate to the extent that I could still be useful.It looks like you have solved the problem, so my participation is not necessary. If you would find my participation useful I will see what I can do, but I am on vacation and don’t have my full resources, I am busy with the trip, and it will take time to come fully up to speed. On Apr 4, 2024, at 4:10 PM, Jannis Teunissen ***@***.***> wrote: Glad to see that this test was also useful in other ways! I'd be happy to contribute it to to the test suite. Let me also mention that the code I compared with used open addressing (and quadratic probing), so in principle better performance should be possible for open_hashmap_type. —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>

gareth-nx · 2024-04-10T23:37:16Z

No problem at all @wclodius2 -- enjoy your vacation -- and please don't feel obliged to respond unless you feel like it.

gareth-nx mentioned this issue Apr 6, 2024

Update to include int32 hashmap keytype #778

Merged

jvdp1 mentioned this issue Apr 7, 2024

Fix in the procedure remove of chaining hashmaps #788

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unexpected performance of hash maps #785

Unexpected performance of hash maps #785

jannisteunissen commented Apr 2, 2024

jvdp1 commented Apr 2, 2024

gareth-nx commented Apr 3, 2024

gareth-nx commented Apr 4, 2024 •

edited

jvdp1 commented Apr 4, 2024 •

edited

gareth-nx commented Apr 4, 2024 •

edited

jannisteunissen commented Apr 4, 2024

wclodius2 commented Apr 10, 2024 via email

gareth-nx commented Apr 10, 2024

Unexpected performance of hash maps #785

Unexpected performance of hash maps #785

Comments

jannisteunissen commented Apr 2, 2024

jvdp1 commented Apr 2, 2024

gareth-nx commented Apr 3, 2024

gareth-nx commented Apr 4, 2024 • edited

jvdp1 commented Apr 4, 2024 • edited

gareth-nx commented Apr 4, 2024 • edited

jannisteunissen commented Apr 4, 2024

wclodius2 commented Apr 10, 2024 via email

gareth-nx commented Apr 10, 2024

gareth-nx commented Apr 4, 2024 •

edited

jvdp1 commented Apr 4, 2024 •

edited

gareth-nx commented Apr 4, 2024 •

edited