proposal: bytes, strings: add CutByte #67101

aimuz · 2024-04-29T07:41:59Z

Proposal Details

Abstract

This proposal suggests the addition of a new function, CutByte, to the strings and bytes packages in the Go standard library. The function aims to simplify the handling of strings and byte slices by cutting them around the first instance of a specified byte separator. This function is designed to offer a more efficient alternative to the existing Cut function when the separator is a single byte, providing up to a 25% performance improvement in typical use cases.

Background

The existing Cut function in the Go standard library has proven to be extremely useful for handling strings and byte slices by simplifying code and replacing multiple standard library functions. However, in scenarios where the separator is known to be a single byte, the Cut function can be optimized further. The proposed CutByte function addresses this by focusing on byte-level operations, which are common in many real-world applications, such as parsing binary protocols or handling ASCII-based text formats.

Details

./cmd/vet/vet_test.go:			if _, suffix, ok := strings.Cut(text, " "); ok {
./cmd/go/scriptreadme_test.go:	_, lang, ok := strings.Cut(doc.String(), "# Script Language\n\n")
./cmd/go/scriptreadme_test.go:	lang, _, ok = strings.Cut(lang, "\n\nvar ")
./cmd/go/internal/test/test.go:		op, name, found := strings.Cut(s, " ")
./cmd/go/internal/vcweb/script.go:			mPath, version, ok := strings.Cut(args[1], "@")
./cmd/go/internal/modfetch/fetch.go:				_, vers, _ = strings.Cut(vers, "-")
./cmd/go/internal/modfetch/fetch.go:					goos, goarch, _ := strings.Cut(vers[i+1:], "-")
./cmd/go/internal/modfetch/codehost/vcs.go:		before, after, found := strings.Cut(line, ":")
./cmd/go/internal/modfetch/toolchain.go:	_, v, ok := strings.Cut(v, "-")
./cmd/go/internal/modload/list.go:		if path, vers, found := strings.Cut(arg, "@"); found {
./cmd/go/internal/modload/list.go:		if path, vers, found := strings.Cut(arg, "@"); found {
./cmd/go/internal/modload/build.go:	if path, vers, found := strings.Cut(path, "@"); found {
./cmd/go/internal/modload/search.go:			elem, rest, found := strings.Cut(p, string(filepath.Separator))
./cmd/go/internal/modload/init.go:	line, _, _ := strings.Cut(string(buf[:n]), "\n")
./cmd/go/internal/envcmd/env.go:		key, val, found := strings.Cut(arg, "=")
./cmd/go/internal/modindex/build.go:		line, argstr, ok := strings.Cut(strings.TrimSpace(line[4:]), ":")
./cmd/go/internal/modindex/build.go:	name, _, _ = strings.Cut(name, ".")
./cmd/go/internal/modindex/build_read.go:			path, _, ok = strings.Cut(args[1:], "`")
./cmd/go/internal/script/engine.go:		prefix, suffix, ok := strings.Cut(cond.tag, ":")
./cmd/go/internal/script/engine.go:		if prefix, suffix, ok := strings.Cut(tag, ":"); ok {
./cmd/go/internal/script/state.go:		if k, v, ok := strings.Cut(kv, "="); ok {
./cmd/go/internal/modcmd/edit.go:	before, after, found := strings.Cut(arg, "@")
./cmd/go/internal/modcmd/edit.go:	before, after, found := strings.Cut(arg, "@")
./cmd/go/internal/modcmd/edit.go:	before, after, found := strings.Cut(s, ",")
./cmd/go/internal/modcmd/edit.go:	before, after, found := strings.Cut(arg, "=")
./cmd/go/internal/load/godebug.go:	k, v, ok := strings.Cut(strings.TrimSpace(text[i:]), "=")
./cmd/go/internal/vcs/vcs.go:		pattern, list, found := strings.Cut(item, ":")
./cmd/go/internal/vcs/vcs.go:	if scheme, path, ok := strings.Cut(repo, "://"); ok {
./cmd/go/internal/work/gccgo.go:		before, after, _ := strings.Cut(args, "=")
./cmd/go/internal/gover/gomod.go:	s, _, _ = strings.Cut(s, "//") // strip comments
./cmd/go/internal/gover/mod_test.go:		path, vers, _ := strings.Cut(f, " ")
./cmd/go/internal/modget/query.go:	pattern, rawVers, found := strings.Cut(raw, "@")
./cmd/go/internal/cmdflag/flag.go:	name, value, hasValue := strings.Cut(arg[1:], "=")
./cmd/go/internal/workcmd/edit.go:	before, after, found := strings.Cut(arg, "@")
./cmd/go/internal/workcmd/edit.go:	before, after, found := strings.Cut(arg, "=")
./cmd/go/internal/toolchain/select.go:		min, suffix, plus := strings.Cut(gotoolchain, "+") // go1.2.3+auto
./cmd/go/internal/toolchain/select.go:		name, val, hasEq := strings.Cut(a, "=")
./cmd/go/internal/toolchain/select.go:				name, _, _ := strings.Cut(a, "=")
./cmd/go/internal/toolchain/select.go:	path, version, _ := strings.Cut(pkgArg, "@")
./cmd/go/main.go:		_, dir, _ = strings.Cut(a, "=")
./cmd/distpack/pack.go:	version, rest, _ := strings.Cut(string(data), "\n")
./cmd/internal/testdir/testdir_test.go:		goos, goarch, ok := strings.Cut(*target, "/")
./cmd/internal/testdir/testdir_test.go:		line, actionSrc, _ = strings.Cut(actionSrc, "\n")
./cmd/internal/testdir/testdir_test.go:	header, _, ok := strings.Cut(src, "\npackage")
./cmd/internal/testdir/testdir_test.go:			if _, suffix, ok := strings.Cut(text, " "); ok {
./cmd/internal/testdir/testdir_test.go:		lines[i], _, _ = strings.Cut(lines[i], " // ERROR ")
./cmd/internal/testdir/testdir_test.go:		errFile, rest, ok := strings.Cut(errStr, ":")
./cmd/internal/testdir/testdir_test.go:		lineStr, msg, ok := strings.Cut(rest, ":")
./cmd/compile/internal/base/flag.go:		verb, args, found := strings.Cut(line, " ")
./cmd/compile/internal/base/flag.go:		before, after, hasEq := strings.Cut(args, "=")
./cmd/link/internal/ld/ld.go:		verb, args, found := strings.Cut(line, " ")
./cmd/link/internal/ld/ld.go:		before, after, exist := strings.Cut(args, "=")
./cmd/link/internal/ld/go.go:		line, data, _ = strings.Cut(data, "\n")
./cmd/link/internal/ld/go.go:			if before, after, found := strings.Cut(remote, "#"); found {
./cmd/cgo/internal/testcshared/cshared_test.go:				switch prefix, _, _ := strings.Cut(name, "-"); prefix {
./cmd/fix/typecheck.go:				if _, elem, ok := strings.Cut(t, "]"); ok {
./cmd/fix/typecheck.go:				if _, et, ok := strings.Cut(t, "]"); ok {
./cmd/fix/typecheck.go:				if kt, vt, ok := strings.Cut(t[len("map["):], "]"); ok {
./cmd/fix/typecheck.go:				_, value, _ = strings.Cut(t, "]")
./cmd/fix/typecheck.go:				if k, v, ok := strings.Cut(t[len("map["):], "]"); ok {
./cmd/api/main_test.go:			feature, approval, ok := strings.Cut(line, "#")
./cmd/doc/dirs.go:		path, dir, _ := strings.Cut(line, "\t")
./cmd/vendor/golang.org/x/tools/cmd/bisect/main.go:		if _, v, _ := strings.Cut(e, "="); strings.Contains(v, "PATTERN") {
./cmd/vendor/golang.org/x/tools/cmd/bisect/main.go:		k, v, _ := strings.Cut(x, "=")
./cmd/vendor/golang.org/x/tools/cmd/bisect/main.go:		line, all, _ = strings.Cut(all, "\n")
./cmd/vendor/golang.org/x/tools/go/analysis/passes/directive/directive.go:		line, text, _ = strings.Cut(text, "\n")
./cmd/vendor/golang.org/x/tools/go/analysis/passes/directive/directive.go:				_, line, ok = strings.Cut(line, "*/")
./cmd/vendor/golang.org/x/tools/internal/versions/versions.go:	v, _, _ = strings.Cut(v, "-") // strip -bigcorp suffix.
./cmd/vendor/golang.org/x/tools/internal/stdlib/stdlib.go:	typename, name, _ = strings.Cut(sym.Name, ".")
./cmd/vendor/golang.org/x/tools/internal/stdlib/stdlib.go:	recv, name, _ = strings.Cut(sym.Name, ".")
./cmd/vendor/golang.org/x/telemetry/internal/crashmonitor/monitor.go:		_, pcstr, ok := strings.Cut(line, " pc=") // e.g. pc=0x%x
./cmd/vendor/golang.org/x/telemetry/internal/config/config.go:			prefix, _, found := strings.Cut(c.Name, ":")
./cmd/vendor/golang.org/x/telemetry/internal/config/config.go:	prefix, rest, hasBuckets := strings.Cut(counter, "{")
./cmd/vendor/golang.org/x/telemetry/internal/counter/parse.go:		k, v, ok := strings.Cut(line, ": ")
./cmd/vendor/golang.org/x/telemetry/internal/upload/reports.go:				before, _, _ := strings.Cut(k, "\n")
./cmd/vendor/golang.org/x/build/relnote/relnote.go:	dir, rest, _ := strings.Cut(filename, "/")
./cmd/vendor/golang.org/x/build/relnote/relnote.go:	dir, rest, _ = strings.Cut(rest, "/")
./crypto/ecdsa/ecdsa_test.go:			curve, hash, _ := strings.Cut(line, ",")
./crypto/x509/pem_decrypt.go:	mode, hexIV, ok := strings.Cut(dek, ",")
./crypto/x509/verify_test.go:	before, _, ok := strings.Cut(string(out), ".")
./crypto/tls/handshake_test.go:		_, after, ok := strings.Cut(line, " ")
./crypto/tls/handshake_test.go:		before, _, ok := strings.Cut(line, "|")
./strconv/fp_test.go:	if mant, exp, ok := strings.Cut(s, "p"); ok {
./strconv/fp_test.go:	if mant, exp, ok := strings.Cut(s, "p"); ok {
./strings/example_test.go:		before, after, found := strings.Cut(s, sep)
./net/mail/message.go:		k, v, ok := strings.Cut(kv, ":")
./net/platform_test.go:	net, _, _ := strings.Cut(network, ":")
./net/platform_test.go:	switch net, _, _ := strings.Cut(network, ":"); net {
./net/platform_test.go:	switch net, _, _ := strings.Cut(network, ":"); net {
./net/url/url.go:	u, frag, _ := strings.Cut(rawURL, "#")
./net/url/url.go:		rest, url.RawQuery, _ = strings.Cut(rest, "?")
./net/url/url.go:		if segment, _, _ := strings.Cut(rest, "/"); strings.Contains(segment, ":") {
./net/url/url.go:		username, password, _ := strings.Cut(userinfo, ":")
./net/url/url.go:			if segment, _, _ := strings.Cut(path, "/"); strings.Contains(segment, ":") {
./net/url/url.go:		key, query, _ = strings.Cut(query, "&")
./net/url/url.go:		key, value, _ := strings.Cut(key, "=")
./net/url/url.go:		elem, remaining, found = strings.Cut(remaining, "/")
./net/main_posix_test.go:	net, _, _ := strings.Cut(network, ":")
./net/http/transport.go:			_, text, ok := strings.Cut(resp.Status, " ")
./net/http/response.go:	proto, status, ok := strings.Cut(line, " ")
./net/http/response.go:	statusCode, _, _ := strings.Cut(resp.Status, " ")
./net/http/request.go:	username, password, ok = strings.Cut(cs, ":")
./net/http/request.go:	method, rest, ok1 := strings.Cut(line, " ")
./net/http/request.go:	requestURI, proto, ok2 := strings.Cut(rest, " ")
./net/http/cgi/host_test.go:		k, v, ok := strings.Cut(trimmedLine, "=")
./net/http/cgi/host.go:		header, val, ok := strings.Cut(string(line), ":")
./net/http/cgi/child.go:		if k, v, ok := strings.Cut(kv, "="); ok {
./net/http/fs.go:		start, end, ok := strings.Cut(ra, "-")
./net/http/cookie.go:		name, value, found := strings.Cut(s, "=")
./net/http/cookie.go:	name, value, ok := strings.Cut(parts[0], "=")
./net/http/cookie.go:		attr, val, _ := strings.Cut(parts[i], "=")
./net/http/cookie.go:			part, line, _ = strings.Cut(line, ";")
./net/http/cookie.go:			name, val, _ := strings.Cut(part, "=")
./net/http/client_test.go:				first, rest, _ := strings.Cut(final, ",")
./net/http/main_test.go:		_, stack, _ := strings.Cut(g, "\n")
./net/smtp/smtp.go:			k, v, _ := strings.Cut(line, " ")
./net/main_test.go:		_, stack, _ := strings.Cut(s, "\n")
./go/types/eval_test.go:	before, after, _ := strings.Cut(s, sep)
./go/importer/importer_test.go:	compiler, target, _ := strings.Cut(export, ":")
./go/printer/comment.go:		line, text, _ = strings.Cut(text, "\n")
./go/printer/printer.go:	if p, _, ok := strings.Cut(prefix, "*"); ok {
./go/printer/printer.go:	before, _, _ := strings.Cut(last, closing) // closing always present
./go/constant/value_test.go:			if ns, ds, ok := strings.Cut(a[1], "/"); ok && kind == token.FLOAT {
./go/constant/value_test.go:	if as, bs, ok := strings.Cut(lit, "/"); ok {
./go/version/version.go:	v, _, _ = strings.Cut(v, "-") // strip -bigcorp suffix.
./go/doc/example_test.go:				name, kind, found := strings.Cut(sectionName, ".")
./go/doc/comment/text.go:			line, text, _ = strings.Cut(text, "\n")
./go/doc/comment/markdown.go:			line, md, _ = strings.Cut(md, "\n")
./go/doc/comment/print.go:			line, md, _ = strings.Cut(md, "\n")
./go/doc/comment/print.go:		line, rest, ok := strings.Cut(s, "\n")
./go/doc/comment/parse.go:		if _, b, ok = strings.Cut(b, "'"); !ok {
./go/doc/comment/parse.go:		if _, b, ok = strings.Cut(b, "."); !ok {
./go/doc/headscan.go:		inner, s, _ = strings.Cut(s[loc[1]:], html_endh)
./go/build/read_test.go:		beforeP, afterP, _ := strings.Cut(tt.in, "ℙ")
./go/build/read_test.go:		if beforeD, afterD, ok := strings.Cut(beforeP, "𝔻"); ok {
./go/build/build.go:		line, argstr, ok := strings.Cut(strings.TrimSpace(line[4:]), ":")
./go/build/build.go:	name, _, _ = strings.Cut(name, ".")
./go/build/vendor_test.go:			_, pkg, found = strings.Cut(fullPkg, "/vendor/")
./go/build/build_test.go:	errStr, _, _ = strings.Cut(errStr, ";")
./go/build/read.go:			path, _, ok = strings.Cut(args[1:], "`")
./go/build/constraint/vers.go:		_, v, _ := strings.Cut(z.Tag, "go1.")
./regexp/exec_test.go:				loStr, hiStr, _ := strings.Cut(pair, "-")
./regexp/exec_test.go:			if _, flag, ok = strings.Cut(flag[1:], ":"); !ok {
./regexp/regexp.go:		before, after, ok := strings.Cut(template, "$")
./regexp/syntax/parse.go:					lit, t, _ = strings.Cut(t[2:], `\E`)
./archive/tar/writer_test.go:		prefix, _, _ = strings.Cut(prefix, "\x00") // Truncate at the NUL terminator
./archive/tar/strconv.go:	ss, sn, _ := strings.Cut(s, ".")
./archive/tar/strconv.go:	nStr, rest, ok := strings.Cut(s, " ")
./archive/tar/strconv.go:	k, v, ok = strings.Cut(rec, "=")
./path/filepath/path.go:		part, p, _ = strings.Cut(p, "/")
./runtime/testdata/testprog/traceback_ancestors.go:				g, all, _ = strings.Cut(all, "\n\n")
./runtime/testdata/testprog/traceback_ancestors.go:					id, _, _ := strings.Cut(strings.TrimPrefix(g, "goroutine "), " ")
./runtime/pprof/proto_test.go:			in, out, ok := strings.Cut(tt, "->\n")
./runtime/pprof/pprof_test.go:		if _, s, ok = strings.Cut(s, t); !ok {
./runtime/pprof/pprof_test.go:	base, kv, ok := strings.Cut(spec, ";")
./runtime/pprof/pprof_test.go:	k, v, ok := strings.Cut(kv, "=")
./runtime/pprof/proto.go:		loStr, hiStr, ok := strings.Cut(string(addr), "-")
./runtime/trace_cgo_test.go:		prefix, tracePath, found := strings.Cut(got, ":")
./runtime/traceback_test.go:		_, tb, _ = strings.Cut(tb, "\n")
./runtime/traceback_test.go:		line, tb, _ = strings.Cut(tb, "\n")
./runtime/traceback_test.go:			funcName, args, found := strings.Cut(line, "(")
./runtime/traceback_system_test.go:		_, pcstr, ok := strings.Cut(line, " pc=") // e.g. pc=0x%x
./runtime/debug/mod.go:		line, data, ok = strings.Cut(data, newline)
./runtime/debug/mod.go:				key, rawValue, ok = strings.Cut(kv, "=")
./internal/testenv/testenv.go:				line, goMod, _ = strings.Cut(goMod, "\n")
./internal/testenv/testenv.go:			importPath, export, ok := strings.Cut(line, "=")
./internal/zstd/zstd_test.go:			want, _, _ := strings.Cut(name, ".")
./encoding/asn1/common.go:		part, str, _ = strings.Cut(str, ",")
./encoding/xml/typeinfo.go:	if ns, t, ok := strings.Cut(tag, " "); ok {
./encoding/xml/xml.go:	} else if space, local, ok := strings.Cut(s, ":"); !ok || space == "" || local == "" {
./encoding/json/tags.go:	tag, opt, _ := strings.Cut(tag, ",")
./encoding/json/tags.go:		name, s, _ = strings.Cut(s, ",")
./html/template/js.go:	mimeType, _, _ = strings.Cut(mimeType, ";")
./html/template/url.go:	if protocol, _, ok := strings.Cut(s, ":"); ok && !strings.Contains(protocol, "/") {
./html/template/attr.go:	} else if prefix, short, ok := strings.Cut(name, ":"); ok {
./testing/testing_test.go:				if name, _, ok := strings.Cut(trimmed, " "); ok {
./log/slog/slogtest_test.go:		kv, rest, _ := strings.Cut(s, " ") // assumes exactly one space between attrs
./log/slog/slogtest_test.go:		k, value, found := strings.Cut(kv, "=")
./mime/mediatype.go:	if major, sub, ok := strings.Cut(t, "/"); !ok {
./mime/mediatype.go:	base, _, _ := strings.Cut(v, ";")
./mime/mediatype.go:		if baseName, _, ok := strings.Cut(key, "*"); ok {
./mime/encodedword.go:	charset, text, _ := strings.Cut(word, "?")
./mime/encodedword.go:	encoding, text, _ := strings.Cut(text, "?")
./os/user/lookup_unix.go:		u.Name, _, _ = strings.Cut(u.Name, ",")
./os/user/cgo_lookup_unix.go:	u.Name, _, _ = strings.Cut(u.Name, ",")
./os/os_test.go:		host, _, ok := strings.Cut(hostname, ".")
./os/exec/exec_windows_test.go:		k, _, ok := strings.Cut(kv, "=")
./os/exec/exec_test.go:	errLine, body, ok := strings.Cut(string(bs), "\n")
./os/exec/exec.go:		k, _, ok := strings.Cut(kv, "=")
./text/template/option.go:	if key, value, ok := strings.Cut(opt, "="); ok {

Proposal

We propose adding the following functions:

For the `strings` package:

// CutByte slices s around the first instance of sep,
// returning the text before and after sep.
// The found result reports whether sep appears in s.
// If sep does not appear in s, CutByte returns s, "", false.
func CutByte(s string, sep byte) (before, after string, found bool) {
    if i := IndexByte(s, sep); i >= 0 {
        return s[:i], s[i+1:], true
    }
    return s, "", false
}

For the `bytes` package:

// CutByte slices s around the first instance of sep,
// returning the text before and after sep.
// The found result reports whether sep appears in s.
// If sep does not appear in s, CutByte returns s, nil, false.
// CutByte returns slices of the original slice s, not copies.
func CutByte(s []byte, sep byte) (before, after []byte, found bool) {
    if i := IndexByte(s, sep); i >= 0 {
        return s[:i], s[i+1:], true
    }
    return s, nil, false
}

Rationale

The CutByte function is specifically optimized for cases where the separator is a single byte. In the analysis of Go's main repository and several large-scale Go projects, a significant number of string manipulations involve single-byte separators. The performance benefit of CutByte over Cut for these cases is approximately 25%, as measured in benchmarks comparing the two functions under similar conditions.

Compatibility

CutByte is a new addition and does not modify any existing interfaces or behavior in the Go standard library. It follows the established patterns and naming conventions of the Go ecosystem, ensuring that it integrates seamlessly with the existing library functions.

Implementation

The implementation of CutByte is straightforward and leverages the existing IndexByte function in both the strings and bytes packages. The proposed functions do not introduce any new dependencies or significant complexities.

Conclusion

Adding CutByte to the Go standard library will provide developers with a more efficient tool for handling common string and byte slice operations involving single-byte separators. This function not only enhances performance but also maintains readability and simplicity, aligning with Go's philosophy of clear and efficient coding practices.

The text was updated successfully, but these errors were encountered:

CutByte optimizes slicing operations for single-byte separators, offering a more efficient alternative when only a single byte is involved. There is more discussion on https://golang.org/issue/67101. Fixes golang#67101.

gopherbot · 2024-04-29T07:56:21Z

Change https://go.dev/cl/582176 mentions this issue: bytes, strings: add CutByte

adonovan · 2024-04-29T17:27:50Z

Can the performance of the existing function be improved without adding a new one? For example, how does this compare:

func Cut(s, sep []byte) (before, after []byte, found bool) {
	if len(sep) == 1 {
		if i := IndexByte(s, sep); i >= 0 {
			return s[:i], s[i+1:], true
		}
	}
	if i := Index(s, sep); i >= 0 {
		return s[:i], s[i+len(sep):], true
	}
	return s, nil, false
}

mateusz834 · 2024-04-29T17:43:44Z

@adonovan Index already has a optimization for one-byte strings

go/src/internal/stringslite/strings.go

Lines 25 to 31 in 16ce8b3

    
           func Index(s, substr string) int { 
        
           	n := len(substr) 
        
           	switch { 
        
           	case n == 0: 
        
           		return 0 
        
           	case n == 1: 
        
           		return IndexByte(s, substr[0])

ianlancetaylor · 2024-04-29T17:44:29Z

Please don't use images for plain text. Images are hard to read. Plain text is easy. Thanks.

ianlancetaylor · 2024-04-29T17:46:31Z

See #22148 for a relevant discussion.

adonovan · 2024-04-29T18:43:56Z

Index already has a optimization for one-byte strings

So it does. Are you saying that even with this fastpath, the overhead of evaluating len(substr), two switch conditions, and substr[0] makes it 25% slower? Could you furnish the benchmark you used?

aimuz · 2024-04-30T02:54:16Z

Please don't use images for plain text. Images are hard to read. Plain text is easy. Thanks.

I'm sorry, I've changed it to text, to avoid too much content, now you need to click on Details to see the content

Can the performance of the existing function be improved without adding a new one? For example, how does this compare:

func Cut(s, sep []byte) (before, after []byte, found bool) {
	if len(sep) == 1 {
		if i := IndexByte(s, sep); i >= 0 {
			return s[:i], s[i+1:], true
		}
	}
	if i := Index(s, sep); i >= 0 {
		return s[:i], s[i+len(sep):], true
	}
	return s, nil, false
}

Such code will not improve anything and will even degrade performance

Index already has a optimization for one-byte strings

So it does. Are you saying that even with this fastpath, the overhead of evaluating len(substr), two switch conditions, and substr[0] makes it 25% slower? Could you furnish the benchmark you used?

This is my benchmark code.

func BenchmarkCut(b *testing.B) {
	b.ReportAllocs()
	skip := 32
	s := Repeat(Repeat(" ", skip)+"a"+Repeat(" ", skip), 1<<16/skip)
	b.Run(fmt.Sprintf("Func=Cut-%d", skip), func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_, _, _ = Cut(s, "a")
		}
	})
	b.Run(fmt.Sprintf("Func=CutByte-%d", skip), func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_, _, _ = CutByte(s, 'a')
		}
	})
}

benchstat -col /Func  new.txt  
goos: darwin
goarch: arm64
pkg: strings
cpu: Apple M2 Max
       │   Cut-32    │             CutByte-32              │
       │   sec/op    │   sec/op     vs base                │
Cut-12   4.644n ± 3%   3.614n ± 1%  -22.18% (p=0.000 n=10)

       │   Cut-32   │           CutByte-32           │
       │    B/op    │    B/op     vs base            │
Cut-12   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

       │   Cut-32   │           CutByte-32           │
       │ allocs/op  │ allocs/op   vs base            │
Cut-12   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

I would prefer to use the following benchmarking procedure so that I can do benchmark comparisons on different samples. When I tried to run benchstat -col /Func new.txt I got unexpected results, so I switched to the benchmarking above, using a fixed sample of tests

func BenchmarkCut(b *testing.B) {
	b.ReportAllocs()

	for _, skip := range [...]int{2, 4, 8, 16, 32, 64} {
		s := Repeat(Repeat(" ", skip)+"a"+Repeat(" ", skip), 1<<16/skip)
		b.Run(fmt.Sprintf("Func=Cut-%d", skip), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, _, _ = Cut(s, "a")
			}
		})
		b.Run(fmt.Sprintf("Func=CutByte-%d", skip), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, _, _ = CutByte(s, 'a')
			}
		})
	}
}

See #22148 for a relevant discussion.

IndexByte gets a significant performance boost after CL 148578. So I don't think "Using strings.IndexByte is premature optimisation." is entirely applicable now.

aimuz · 2024-04-30T02:56:43Z

If I've missed any information, please clue me in, thanks!

egonelbre · 2024-04-30T08:40:05Z

I benchmarked this using https://github.com/egonelbre/exp/blob/main/bench/cutbyte/bench_test.go -- and then combining the results, I got the result (using Go master):

cpu: AMD Ryzen Threadripper 2950X 16-Core Processor
               │     Cut     │              CutByte               │
               │   sec/op    │   sec/op     vs base               │
Cut/skip=2-32    75.39n ± 2%   75.79n ± 2%       ~ (p=0.152 n=24)
Cut/skip=4-32    75.85n ± 1%   75.65n ± 1%       ~ (p=0.477 n=24)
Cut/skip=8-32    75.15n ± 2%   76.11n ± 1%       ~ (p=0.270 n=24)
Cut/skip=16-32   75.85n ± 1%   75.80n ± 2%       ~ (p=0.927 n=24)
Cut/skip=32-32   76.68n ± 1%   77.08n ± 3%       ~ (p=0.717 n=24)
Cut/skip=64-32   76.34n ± 1%   76.95n ± 2%       ~ (p=0.724 n=24)
geomean          75.88n        76.23n       +0.47%

Individual results:

cpu: AMD Ryzen Threadripper 2950X 16-Core Processor
                │     Cut     │              CutByte              │
                │   sec/op    │   sec/op     vs base              │
Cut0/skip=2-32    76.13n ± 1%   73.40n ± 3%  -3.59% (p=0.002 n=6)
Cut0/skip=4-32    75.76n ± 4%   72.25n ± 3%  -4.63% (p=0.002 n=6)
Cut0/skip=8-32    75.47n ± 3%   71.11n ± 3%  -5.78% (p=0.002 n=6)
Cut0/skip=16-32   76.63n ± 3%   70.83n ± 3%  -7.56% (p=0.002 n=6)
Cut0/skip=32-32   76.25n ± 4%   71.83n ± 4%  -5.81% (p=0.009 n=6)
Cut0/skip=64-32   75.91n ± 4%   72.01n ± 3%  -5.14% (p=0.002 n=6)
Cut1/skip=2-32    73.66n ± 2%   75.65n ± 3%  +2.69% (p=0.015 n=6)
Cut1/skip=4-32    75.80n ± 4%   75.72n ± 1%       ~ (p=0.848 n=6)
Cut1/skip=8-32    74.35n ± 2%   76.17n ± 2%  +2.44% (p=0.015 n=6)
Cut1/skip=16-32   74.25n ± 2%   75.65n ± 1%       ~ (p=0.180 n=6)
Cut1/skip=32-32   75.48n ± 2%   76.36n ± 4%       ~ (p=0.485 n=6)
Cut1/skip=64-32   75.60n ± 2%   77.56n ± 3%       ~ (p=0.180 n=6)
Cut2/skip=2-32    73.91n ± 2%   77.81n ± 3%  +5.29% (p=0.002 n=6)
Cut2/skip=4-32    74.81n ± 4%   77.52n ± 4%       ~ (p=0.093 n=6)
Cut2/skip=8-32    74.51n ± 4%   77.62n ± 2%  +4.17% (p=0.004 n=6)
Cut2/skip=16-32   76.07n ± 2%   78.75n ± 2%  +3.52% (p=0.002 n=6)
Cut2/skip=32-32   76.93n ± 2%   78.31n ± 2%  +1.79% (p=0.041 n=6)
Cut2/skip=64-32   76.45n ± 2%   78.09n ± 1%  +2.14% (p=0.009 n=6)
Cut3/skip=2-32    76.83n ± 2%   77.17n ± 3%       ~ (p=0.699 n=6)
Cut3/skip=4-32    75.93n ± 2%   75.77n ± 5%       ~ (p=0.909 n=6)
Cut3/skip=8-32    76.61n ± 2%   76.11n ± 1%       ~ (p=0.937 n=6)
Cut3/skip=16-32   75.64n ± 2%   75.89n ± 2%       ~ (p=0.818 n=6)
Cut3/skip=32-32   76.84n ± 9%   78.55n ± 2%       ~ (p=0.589 n=6)
Cut3/skip=64-32   77.31n ± 3%   76.65n ± 2%       ~ (p=0.310 n=6)
geomean           75.71n        75.66n       -0.07%

I used multiple implementations of the same code, because microbenchmarks are prone to statistical noise due to code layout differences.

aimuz · 2024-04-30T11:38:00Z

Too slow in your tests, maybe a faulty benchmark?

https://github.com/egonelbre/exp/blob/main/bench/cutbyte/bench_test.go#L65

Can you accept it without a global variable?

diff --git a/bench/cutbyte/bench_test.go b/bench/cutbyte/bench_test.go
index f56b9ac..0489ac5 100644
--- a/bench/cutbyte/bench_test.go
+++ b/bench/cutbyte/bench_test.go
@@ -62,8 +62,6 @@ func CutByte3(s string, sep byte) (before, after string, found bool) {
 	return s, "", false
 }
 
-var x, y, z any
-
 func BenchmarkCut0(b *testing.B) {
 	b.ReportAllocs()
 
@@ -71,13 +69,15 @@ func BenchmarkCut0(b *testing.B) {
 		s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
 		b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = Cut0(s, "a")
+				x, y, z := Cut0(s, "a")
+				_, _, _ = x, y, z
 			}
 		})
 
 		b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = CutByte0(s, 'a')
+				x, y, z := CutByte0(s, 'a')
+				_, _, _ = x, y, z
 			}
 		})
 	}
@@ -89,13 +89,15 @@ func BenchmarkCut1(b *testing.B) {
 		s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
 		b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = Cut1(s, "a")
+				x, y, z := Cut1(s, "a")
+				_, _, _ = x, y, z
 			}
 		})
 
 		b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = CutByte1(s, 'a')
+				x, y, z := CutByte1(s, 'a')
+				_, _, _ = x, y, z
 			}
 		})
 	}
@@ -108,13 +110,15 @@ func BenchmarkCut2(b *testing.B) {
 		s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
 		b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = Cut2(s, "a")
+				x, y, z := Cut2(s, "a")
+				_, _, _ = x, y, z
 			}
 		})
 
 		b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = CutByte2(s, 'a')
+				x, y, z := CutByte2(s, 'a')
+				_, _, _ = x, y, z
 			}
 		})
 	}
@@ -127,14 +131,16 @@ func BenchmarkCut3(b *testing.B) {
 		s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
 		b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = Cut3(s, "a")
+				x, y, z := Cut3(s, "a")
+				_, _, _ = x, y, z
 			}
 		})
 
 		b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
 			for i := 0; i < b.N; i++ {
-				x, y, z = CutByte3(s, 'a')
+				x, y, z := CutByte3(s, 'a')
+				_, _, _ = x, y, z
 			}
 		})
 	}
-}
\ No newline at end of file
+}

I modified your test case a bit and got the following results

goos: darwin
goarch: arm64
pkg: github.com/egonelbre/exp/bench/cutbyte
                │     Cut     │               CutByte               │
                │   sec/op    │   sec/op     vs base                │
Cut0/skip=2-12    4.172n ± 3%   3.139n ± 1%  -24.75% (p=0.000 n=10)
Cut0/skip=4-12    4.143n ± 2%   3.132n ± 2%  -24.40% (p=0.000 n=10)
Cut0/skip=8-12    4.177n ± 3%   3.174n ± 2%  -24.01% (p=0.000 n=10)
Cut0/skip=16-12   4.151n ± 2%   3.147n ± 1%  -24.17% (p=0.000 n=10)
Cut0/skip=32-12   4.633n ± 3%   3.582n ± 1%  -22.70% (p=0.000 n=10)
Cut0/skip=64-12   4.854n ± 5%   3.936n ± 2%  -18.91% (p=0.000 n=10)
Cut1/skip=2-12    4.172n ± 2%   3.152n ± 2%  -24.45% (p=0.000 n=10)
Cut1/skip=4-12    4.175n ± 3%   3.171n ± 1%  -24.04% (p=0.000 n=10)
Cut1/skip=8-12    4.209n ± 2%   3.161n ± 1%  -24.91% (p=0.000 n=10)
Cut1/skip=16-12   4.232n ± 3%   3.159n ± 1%  -25.33% (p=0.000 n=10)
Cut1/skip=32-12   4.601n ± 5%   3.593n ± 1%  -21.92% (p=0.000 n=10)
Cut1/skip=64-12   5.016n ± 3%   3.857n ± 0%  -23.11% (p=0.000 n=10)
Cut2/skip=2-12    4.153n ± 2%   3.134n ± 1%  -24.52% (p=0.000 n=10)
Cut2/skip=4-12    4.152n ± 1%   3.128n ± 1%  -24.66% (p=0.000 n=10)
Cut2/skip=8-12    4.123n ± 2%   3.135n ± 1%  -23.95% (p=0.000 n=10)
Cut2/skip=16-12   4.154n ± 2%   3.111n ± 2%  -25.13% (p=0.000 n=10)
Cut2/skip=32-12   4.548n ± 3%   3.559n ± 1%  -21.75% (p=0.000 n=10)
Cut2/skip=64-12   4.965n ± 4%   3.853n ± 1%  -22.41% (p=0.000 n=10)
Cut3/skip=2-12    4.098n ± 3%   3.128n ± 1%  -23.67% (p=0.000 n=10)
Cut3/skip=4-12    4.154n ± 3%   3.123n ± 2%  -24.81% (p=0.000 n=10)
Cut3/skip=8-12    4.308n ± 3%   3.114n ± 1%  -27.71% (p=0.000 n=10)
Cut3/skip=16-12   4.181n ± 3%   3.125n ± 1%  -25.27% (p=0.000 n=10)
Cut3/skip=32-12   4.569n ± 3%   3.575n ± 1%  -21.76% (p=0.000 n=10)
Cut3/skip=64-12   4.942n ± 4%   3.883n ± 1%  -21.42% (p=0.000 n=10)
geomean           4.360n        3.324n       -23.76%

                │     Cut      │               CutByte               │
                │     B/op     │    B/op     vs base                 │
Cut0/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                      ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                │     Cut      │               CutByte               │
                │  allocs/op   │ allocs/op   vs base                 │
Cut0/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                      ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Thanks to @egonelbre's tip, I changed the benchmarking to the following way

func BenchmarkCut(b *testing.B) {
	b.ReportAllocs()

	for _, skip := range [...]int{2, 4, 8, 16, 32, 64} {
		s := Repeat(Repeat(" ", skip)+"a"+Repeat(" ", skip), 1<<16/skip)
		b.Run(fmt.Sprintf("Func=Cut/%d", skip), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, _, _ = Cut(s, "a")
			}
		})
		b.Run(fmt.Sprintf("Func=CutByte/%d", skip), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, _, _ = CutByte(s, 'a')
			}
		})
	}
}

benchstat -col /Func  new.txt 
goos: darwin
goarch: arm64
pkg: strings
cpu: Apple M2 Max
          │     Cut     │               CutByte               │
          │   sec/op    │   sec/op     vs base                │
Cut/2-12    4.134n ± 2%   3.171n ± 2%  -23.30% (p=0.000 n=10)
Cut/4-12    4.190n ± 2%   3.208n ± 2%  -23.44% (p=0.000 n=10)
Cut/8-12    4.137n ± 2%   3.210n ± 1%  -22.42% (p=0.000 n=10)
Cut/16-12   4.128n ± 1%   3.223n ± 1%  -21.92% (p=0.000 n=10)
Cut/32-12   4.617n ± 1%   3.630n ± 1%  -21.39% (p=0.000 n=10)
Cut/64-12   4.957n ± 1%   3.927n ± 0%  -20.76% (p=0.000 n=10)
geomean     4.349n        3.383n       -22.21%

          │     Cut      │               CutByte               │
          │     B/op     │    B/op     vs base                 │
Cut/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

          │     Cut      │               CutByte               │
          │  allocs/op   │ allocs/op   vs base                 │
Cut/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

egonelbre · 2024-04-30T12:36:56Z

@aimuz, I'm storing in global vars to ensure that code doesn't get optimized away. It could avoid any though to be faster.

I updated the code, and I'm getting:

cpu: AMD Ryzen Threadripper 2950X 16-Core Processor
               │     Cut      │               CutByte               │
               │    sec/op    │   sec/op     vs base                │
Cut/skip=2-32     8.306n ± 1%   7.235n ± 2%  -12.90% (p=0.000 n=24)
Cut/skip=4-32     8.362n ± 1%   7.188n ± 1%  -14.04% (p=0.000 n=24)
Cut/skip=8-32     8.433n ± 7%   7.396n ± 2%  -12.30% (p=0.000 n=24)
Cut/skip=16-32    8.347n ± 1%   7.240n ± 1%  -13.25% (p=0.000 n=24)
Cut/skip=32-32    9.220n ± 1%   7.882n ± 1%  -14.51% (p=0.000 n=24)
Cut/skip=64-32   10.050n ± 1%   8.607n ± 1%  -14.35% (p=0.000 n=24)
geomean           8.764n        7.575n       -13.56%

Optimize the Cut function in both the bytes and strings packages to immediately return slices when the separator is a single byte (or character), avoiding more complex index searching logic. This change can significantly reduce the execution time for these specific cases, as benchmark tests added to each package demonstrate improvements. The optimization checks if the length of the separator is one before proceeding with the existing search strategy. If so, it uses IndexByte for a faster lookup of the separator's position. Additionally, benchmark tests have been added for both packages to demonstrate the performance benefits of this optimization across various scenarios. goos: darwin goarch: arm64 pkg: strings cpu: Apple M2 Max │ old-cut.txt │ new-cut.txt │ │ sec/op │ sec/op vs base │ Cut/Cut-One/2-12 4.026n ± 2% 3.274n ± 2% -18.68% (p=0.000 n=10) Cut/Cut-Two/2-12 8.093n ± 0% 8.357n ± 0% +3.27% (p=0.000 n=10) Cut/Cut-One/4-12 4.048n ± 1% 3.324n ± 2% -17.91% (p=0.000 n=10) Cut/Cut-Two/4-12 8.105n ± 0% 8.377n ± 1% +3.35% (p=0.000 n=10) Cut/Cut-One/8-12 4.089n ± 1% 3.290n ± 1% -19.53% (p=0.000 n=10) Cut/Cut-Two/8-12 8.107n ± 1% 8.359n ± 1% +3.10% (p=0.000 n=10) Cut/Cut-One/16-12 4.127n ± 1% 3.328n ± 1% -19.35% (p=0.000 n=10) Cut/Cut-Two/16-12 8.119n ± 1% 8.374n ± 1% +3.15% (p=0.000 n=10) Cut/Cut-One/32-12 4.545n ± 2% 3.675n ± 1% -19.14% (p=0.000 n=10) Cut/Cut-Two/32-12 8.708n ± 1% 8.963n ± 1% +2.92% (p=0.000 n=10) Cut/Cut-One/64-12 4.825n ± 2% 4.146n ± 1% -14.08% (p=0.000 n=10) Cut/Cut-Two/64-12 9.286n ± 0% 9.315n ± 1% ~ (p=0.105 n=10) geomean 5.983n 5.486n -8.32% │ old-cut.txt │ new-cut.txt │ │ B/op │ B/op vs base │ Cut/Cut-One/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ geomean ² +0.00% ² ¹ all samples are equal ² summaries must be >0 to compute geomean │ old-cut.txt │ new-cut.txt │ │ allocs/op │ allocs/op vs base │ Cut/Cut-One/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ geomean ² +0.00% ² ¹ all samples are equal ² summaries must be >0 to compute geomean For golang#67101

gopherbot · 2024-05-01T13:44:19Z

Change https://go.dev/cl/582655 mentions this issue: bytes, strings: optimize Cut for single-byte separators

aimuz · 2024-05-01T13:45:22Z

If the length is 1, it does improve, but if the length is greater than 1, it will increase accordingly. Maybe this is acceptable?

The code and improved implementation of the benchmark test will be available at https://go.dev/cl/582655

goos: darwin
goarch: arm64
pkg: strings
cpu: Apple M2 Max
                  │ old-cut.txt │             new-cut.txt             │
                  │   sec/op    │   sec/op     vs base                │
Cut/Cut-One/2-12    4.026n ± 2%   3.274n ± 2%  -18.68% (p=0.000 n=10)
Cut/Cut-Two/2-12    8.093n ± 0%   8.357n ± 0%   +3.27% (p=0.000 n=10)
Cut/Cut-One/4-12    4.048n ± 1%   3.324n ± 2%  -17.91% (p=0.000 n=10)
Cut/Cut-Two/4-12    8.105n ± 0%   8.377n ± 1%   +3.35% (p=0.000 n=10)
Cut/Cut-One/8-12    4.089n ± 1%   3.290n ± 1%  -19.53% (p=0.000 n=10)
Cut/Cut-Two/8-12    8.107n ± 1%   8.359n ± 1%   +3.10% (p=0.000 n=10)
Cut/Cut-One/16-12   4.127n ± 1%   3.328n ± 1%  -19.35% (p=0.000 n=10)
Cut/Cut-Two/16-12   8.119n ± 1%   8.374n ± 1%   +3.15% (p=0.000 n=10)
Cut/Cut-One/32-12   4.545n ± 2%   3.675n ± 1%  -19.14% (p=0.000 n=10)
Cut/Cut-Two/32-12   8.708n ± 1%   8.963n ± 1%   +2.92% (p=0.000 n=10)
Cut/Cut-One/64-12   4.825n ± 2%   4.146n ± 1%  -14.08% (p=0.000 n=10)
Cut/Cut-Two/64-12   9.286n ± 0%   9.315n ± 1%        ~ (p=0.105 n=10)
geomean             5.983n        5.486n        -8.32%

                  │ old-cut.txt  │             new-cut.txt             │
                  │     B/op     │    B/op     vs base                 │
Cut/Cut-One/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                        ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                  │ old-cut.txt  │             new-cut.txt             │
                  │  allocs/op   │ allocs/op   vs base                 │
Cut/Cut-One/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                        ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

aimuz · 2024-05-01T13:48:42Z

Maybe CL 582655 is the best way to go for the current situation? He can get this one optimised without modifying the code in other locations. But for applications with sep greater than 1, he will have a performance loss greater than 2%, is this acceptable?

In go code, I've investigated, and the proportion of sep lengths of 1 is quite large. Maybe we can merge CL 582655?

Optimize the Cut function in both the bytes and strings packages to immediately return slices when the separator is a single byte (or character), avoiding more complex index searching logic. This change can significantly reduce the execution time for these specific cases, as benchmark tests added to each package demonstrate improvements. The optimization checks if the length of the separator is one before proceeding with the existing search strategy. If so, it uses IndexByte for a faster lookup of the separator's position. Additionally, benchmark tests have been added for both packages to demonstrate the performance benefits of this optimization across various scenarios. goos: darwin goarch: arm64 pkg: strings cpu: Apple M2 Max │ old-cut.txt │ new-cut.txt │ │ sec/op │ sec/op vs base │ Cut/Cut-One/2-12 4.026n ± 2% 3.274n ± 2% -18.68% (p=0.000 n=10) Cut/Cut-Two/2-12 8.093n ± 0% 8.357n ± 0% +3.27% (p=0.000 n=10) Cut/Cut-One/4-12 4.048n ± 1% 3.324n ± 2% -17.91% (p=0.000 n=10) Cut/Cut-Two/4-12 8.105n ± 0% 8.377n ± 1% +3.35% (p=0.000 n=10) Cut/Cut-One/8-12 4.089n ± 1% 3.290n ± 1% -19.53% (p=0.000 n=10) Cut/Cut-Two/8-12 8.107n ± 1% 8.359n ± 1% +3.10% (p=0.000 n=10) Cut/Cut-One/16-12 4.127n ± 1% 3.328n ± 1% -19.35% (p=0.000 n=10) Cut/Cut-Two/16-12 8.119n ± 1% 8.374n ± 1% +3.15% (p=0.000 n=10) Cut/Cut-One/32-12 4.545n ± 2% 3.675n ± 1% -19.14% (p=0.000 n=10) Cut/Cut-Two/32-12 8.708n ± 1% 8.963n ± 1% +2.92% (p=0.000 n=10) Cut/Cut-One/64-12 4.825n ± 2% 4.146n ± 1% -14.08% (p=0.000 n=10) Cut/Cut-Two/64-12 9.286n ± 0% 9.315n ± 1% ~ (p=0.105 n=10) geomean 5.983n 5.486n -8.32% │ old-cut.txt │ new-cut.txt │ │ B/op │ B/op vs base │ Cut/Cut-One/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ geomean ² +0.00% ² ¹ all samples are equal ² summaries must be >0 to compute geomean │ old-cut.txt │ new-cut.txt │ │ allocs/op │ allocs/op vs base │ Cut/Cut-One/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/2-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/4-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/8-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/16-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/32-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-One/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ Cut/Cut-Two/64-12 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ geomean ² +0.00% ² ¹ all samples are equal ² summaries must be >0 to compute geomean For golang#67101

aimuz added the Proposal label Apr 29, 2024

gopherbot added this to the Proposal milestone Apr 29, 2024

aimuz linked a pull request Apr 29, 2024 that will close this issue

bytes, strings: add CutByte #67102

Open

aimuz mentioned this issue May 1, 2024

bytes, strings: optimize Cut for single-byte separators #67125

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: bytes, strings: add CutByte #67101

proposal: bytes, strings: add CutByte #67101

aimuz commented Apr 29, 2024 •

edited

gopherbot commented Apr 29, 2024

adonovan commented Apr 29, 2024

mateusz834 commented Apr 29, 2024

ianlancetaylor commented Apr 29, 2024

ianlancetaylor commented Apr 29, 2024

adonovan commented Apr 29, 2024

aimuz commented Apr 30, 2024

aimuz commented Apr 30, 2024

egonelbre commented Apr 30, 2024 •

edited

aimuz commented Apr 30, 2024

egonelbre commented Apr 30, 2024 •

edited

gopherbot commented May 1, 2024

aimuz commented May 1, 2024

aimuz commented May 1, 2024

proposal: bytes, strings: add CutByte #67101

proposal: bytes, strings: add CutByte #67101

Comments

aimuz commented Apr 29, 2024 • edited

Proposal Details

Abstract

Background

Proposal

For the strings package:

For the bytes package:

Rationale

Compatibility

Implementation

Conclusion

gopherbot commented Apr 29, 2024

adonovan commented Apr 29, 2024

mateusz834 commented Apr 29, 2024

ianlancetaylor commented Apr 29, 2024

ianlancetaylor commented Apr 29, 2024

adonovan commented Apr 29, 2024

aimuz commented Apr 30, 2024

aimuz commented Apr 30, 2024

egonelbre commented Apr 30, 2024 • edited

aimuz commented Apr 30, 2024

egonelbre commented Apr 30, 2024 • edited

gopherbot commented May 1, 2024

aimuz commented May 1, 2024

aimuz commented May 1, 2024

aimuz commented Apr 29, 2024 •

edited

For the `strings` package:

For the `bytes` package:

egonelbre commented Apr 30, 2024 •

edited

egonelbre commented Apr 30, 2024 •

edited