reflect: map iteration does unnecessary excessive allocation for non-pointer types #32424

ugorji · 2019-06-04T10:05:09Z

Given a map[string]*T or map[uint16]T or many other types, iterating via reflection is expensive, because we will allocate a new value on the heap each time for very many types.

In a call to reflect.MapKeys, MapIndex, MapIter.Key, MapIter.Value, the implementation includes the following code:

func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		// Copy result so future changes to the map won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

The ifaceIndir call returns true for very many types, including string, intXXX, uintXXX, bool, []uint8 ([]byte), etc.

This causes an allocation to be done when retrieving each map key or value.

Iterating through a large map[int][]byte for example, will cause an allocation for each key or value retrieved from the map.

Without reflection, this doesn't happen. And it need not happen.

It has the effect of causing an explosion in memory use of my benchmarks. Here, I have a map[string]*codec.stringUint64T with about 8192 entries. And I have other maps in here also.

Sample benchmark result:

Benchmark__Json_______Encode-8   	     625	   7503851 ns/op	     432 B/op	       4 allocs/op
Benchmark__Json_______Encode-8   	     297	  15948731 ns/op	  528784 B/op	   32879 allocs/op

The first result was manual using range operator, while the second result was via reflection. The number of allocation jumped from 4 to 32879 per op.

To illustrate, I ran my benchmark which includes a map[string]*codec.stringUint64T with about 8192 entries.

go test -run Nothing -tags "alltests codecgen" -bench "__Json_*En" -benchmem -benchtime 1x  | grep -v -E "^(goos:|goarch:|pkg:|PASS|ok)"

I then instrumented the copyVal function to validate it.

var copyValMap = make(map[*rtype]bool)
var copyValMapNoIface = make(map[*rtype]bool)

// copyVal returns a Value containing the map key or value at ptr,
// allocating a new variable as needed.
func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		if _, ok := copyValMap[typ]; !ok {
			println("copyVal: ifaceIndir: ", typ, typ.String())
			copyValMap[typ] = true
		}
		// Copy result so future changes to the map won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	if _, ok := copyValMapNoIface[typ]; !ok {
		println("copyVal: NO   iface: ", typ, typ.String())
		copyValMapNoIface[typ] = true
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

copyVal: ifaceIndir:  0x14d8880 string
copyVal: ifaceIndir:  0x14d7b00 int64
copyVal: ifaceIndir:  0x14d2380 []uint8
copyVal: NO   iface:  0x1521500 *codec.stringUint64T
copyVal: ifaceIndir:  0x14d8bc0 uint16
copyVal: NO   iface:  0x15213c0 *codec.TestStruc
copyVal: ifaceIndir:  0x1553820 codec.TestStruc

I then instrumented copyVal again to track the number of times that the allocation happened.
It was as expected (>32000 times).

var copyValMap = make(map[*rtype]int)
var copyValMapNoIface = make(map[*rtype]int)

// copyVal returns a Value containing the map key or value at ptr,
// allocating a new variable as needed.
func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		copyValMap[typ]++
		if copyValMap[typ]%1000 == 0 {
			println("copyVal: ifaceIndir: ", typ, typ.String(), copyValMap[typ])
		}
		// Copy result so future changes to the map
		// won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	copyValMapNoIface[typ]++
	if copyValMapNoIface[typ]%1000 == 0 {
		println("copyVal: NO   iface: ", typ, typ.String(), copyValMapNoIface[typ])
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

copyVal: ifaceIndir:  0x14d8960 string 32000
copyVal: NO   iface:  0x15215e0 *codec.stringUint64T 32000

What version of Go are you using (`go version`)?

$ go version
go version devel +1531192272 Mon May 27 17:58:39 2019 +0000 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GO111MODULE="on"
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/bt/526m392x16x8y149jwx28mkm0000gp/T/go-build760118559=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

What did you expect to see?

What did you see instead?

The text was updated successfully, but these errors were encountered:

ugorji · 2019-06-04T13:36:34Z

See attached

github_golang_issue_32424_test.go.txt

Please rename the file to .go, and run as below:

mv github_golang_issue_32424_test.go.txt github_golang_issue_32424_test.go
go test -tags "32424" -run Nothing -bench 'Map(Range|ReflectIter|ReflectIndex)' -benchmem  -cpuprofile cpu.out -memprofile mem.out -memprofilerate 1

Got results:

BenchmarkMapRange-8          	   89374	     12812 ns/op	       0 B/op	       0 allocs/op
BenchmarkMapReflectIter-8    	     660	   1805001 ns/op	   16546 B/op	    1026 allocs/op
BenchmarkMapReflectIndex-8   	     576	   2146006 ns/op	   41088 B/op	    1026 allocs/op

For an operation that shouldn't have any allocations, we have a lot.

$ go tool pprof mem.out 
(pprof) top 2
Showing nodes accounting for 38.34MB, 93.02% of 41.21MB total
Dropped 163 nodes (cum <= 0.21MB)
Showing top 2 nodes out of 20
      flat  flat%   sum%        cum   cum%
   22.47MB 54.52% 54.52%    22.47MB 54.52%  reflect.copyVal
   15.87MB 38.50% 93.02%    26.51MB 64.32%  reflect.Value.MapKeys

The allocation is mostly from copyVal (which I traced as being called for each time we convert a map key or value into a reflect.Value, where we call unsafe_New because interfaceIndir(t) returns true.

For Reflect using MapIndex, the call to MapKeys allocates a slice with same length as the map, which is a reason for why MapIter is better. But that (MapIter) still has allocation because of the copyVal function call.

As you see above, with normal range, there is no allocation. And the allocation is completely because the copyVal code does a seemingly unnecessary allocation for composing a reflect.Value from the map key or value.

ugorji · 2019-06-04T14:15:07Z

From my understanding, the reason for this is that reflection works off interface{}, and so we MUST create a pointer for each non-pointer that is wrapped in an interface.

I am hoping there is a way to create an API that works around this. For example, we could pass in a reflect.Value to each call to MapIter.Key() or Value(), and that would be used to pass on Set the value on each iteration?

randall77 · 2019-06-04T14:57:19Z

From my understanding, the reason for this is that reflection works off interface{}, and so we MUST create a pointer for each non-pointer that is wrapped in an interface.

Correct.

I am hoping there is a way to create an API that works around this. For example, we could pass in a reflect.Value to each call to MapIter.Key() or Value(), and that would be used to pass on Set the value on each iteration?

That's a possibility. That would be a pretty ugly API though.

In general there are lots of places (not just map iteration) where reflect is slow because we're putting non-pointers in interfaces. We've historically come down on the side of cleaner API even though there might be faster ways with an uglier API.

josharian · 2019-06-04T15:20:33Z

For single byte things, we could use runtime.staticbytes. For non-large zero-valued things we could use runtime.zeroval. I don’t know how common those cases are.

ugorji · 2019-06-04T15:47:47Z

@randall77

I am hoping there is a way to create an API that works around this. For example, we could pass in a reflect.Value to each call to MapIter.Key() or Value(), and that would be used to pass on Set the value on each iteration?

That's a possibility. That would be a pretty ugly API though.

In general there are lots of places (not just map iteration) where reflect is slow because we're putting non-pointers in interfaces. We've historically come down on the side of cleaner API even though there might be faster ways with an uglier API.

It's just that I can make a workaround for all other types because I control their location in memory i.e. for a (u)int(|8|16|32|64), float(32|64), bool, string, I can put these in a struct, slice or array or standalone variable, and make an addressable reference. For slices/array (pointers), all elements are already addressable (in general).

However, for maps, because the "slot" can be reused and reclaimed under the hood, we do not expose the allocated space and MUST always re-allocate to expose the entries as an interface{}.

It is thus the case that only map iteration has no workaround.

This is why the option of passing a reference is the only workaround specifically for maps.

I am currently looking to mimic such an interface, leveraging unsafe and cull'ing from the reflection codebase, for API below:

//- look up key, and set val (expecting settable)
//- if return false, then no entry for that key
mapIndex(mapp reflect.Value, key reflect.Value, val reflect.Value) bool
//- creates an iterator, using key and val (expecting both settable) to set the next entries on each iteration
// - if any of key or val is invalid, treat as if a _ was passed to "for ... range"
// 
//- if Next() returns true, then key and val have been set to the next iteration key/value pair
//- if Next() returns false, no more entries
mapIter(key, val reflect.Value) *mapIter
(*mapIter).Next()

IMO, it is not an inelegant API. If retrofited to the current reflection API, it would look like:

// Look up key, and set its mapping into val, returning true if mapping exists
Value.MapIndexUsing(key, val reflect.Value) bool
// Create an iterator, using key and val for each key/value mapping
// Treat as _ if invalid (zero value) allowing us simulate for k, _ = range; _, v = range; k, v = range
Value.MapRangeUsing(key, val reflect.Value) *MapIter

Again, I personally do not see this as an inelegant or unweildly API.

My usecase is the go-codec encoding library: https://github.com/ugorji/go. This is the one place where code-generation is far and away more performant than reflection-based.

Others "encoders" in the space e.g. json-iterator went the way of completely re-writing the reflection package, which I think is extremely fraught with peril: see https://github.com/json-iterator/go https://github.com/modern-go/reflect2 . But they have gotten a lot of mileage doing this, gaining performance equivalent to static codebases.

randall77 · 2019-06-04T17:02:35Z

We could add methods to *reflect.MapIter:

// SetKey does dst.Set(it.Key()).
func (it *MapIter) SetKey(dst Value) { ... }

(and the same for values)

I think that would provide an allocation-free path.

It seems strange to me that we'd provide this optimized path for map iteration, and not for other things that construct a Value. Just scanning value.go, you could make a case for *Recv, Select, Convert, MapIndex, Zero, and Append.

ugorji · 2019-06-04T17:47:31Z

Nice - the methods on *reflect.MapIter work nicely!

I would love to have for MapIndex also, as that is a common use-case. That is the Value.MapIndexUsing(key, dst reflect.Value) reflect.Value above.

Specifically regarding Zero, I always assumed that it would be allocation-free. I thought it would be sufficient to return a readonly non-addressable and non-settable static value as the zero value. That way, every caller gets the same one, but they cannot do much with it except set something else with it or use it readonly.

Regarding Append, there is a somewhat allocation free path with AppendSlice (as you can pass a slice of non-pointers). You can also pass Value which are addressable elements of a slice. So - there is not much of a gap here.

The remaining gaps will be the chan ones (Recv, Select) and Convert. Those are seemingly less common in typical usage (in my limited experience - YMMV). I also do not have a common use for them and do not have any good ideas of how to present it on the API.

We do this by introducing safe and unsafe 1.12+ variants for map iteration and map indexing. This is necessary due to golang/go#32424 - Map Iteration using reflection always causes allocation because every key and value might have to be converted to an interface{} first, meaning scalars may be allocated on the heap. - Map Indexing also causes allocation We workaround both by implementing a copy-free allocation where a holder value is passed into the indexing or iterating operation, and we "copy" the value into that pointer. Also, improve heuristics of decInferLen Also, fix lint, staticcheck and other issues raised via static analysis. Clean out xdebugf comments and remove much old commented code Finally, make more changes to assist the bounds check elimination part of compiler.

JAicewizard · 2020-01-27T17:45:28Z

I personally would love this, map iteration is a major allocation pain-point where it is being used, and is significantly slower compared to either type-specific cases, or code-gen. Personally I am seeing a ~3x increase in encoding removing special cases for common maps(map[string]interface{}, map[interface{}]interface{} and map[string]string). I was benchmarking those types (map[interface{}]interface{} specifically) but those are the most common map types.

I don't thinks a generic send/recv for channels is very common, so I personally wouldn't see the value if its any difficulty implementing. But if those are trivial to do then I would do so for the sake of keeping the same API.

randall77 · 2020-01-27T21:57:28Z

Zero is being handled in #33136 .

I agree that the channel ones are probably not worth worrying about, performance-wise.

Convert I'm not sure.

But Slice (and the ones that call Slice, like Append and AppendSlice) would be even more important than the map ones, I suspect. @ugorji I didn't understand how you think Append could be made allocation free - could you explain?

martisch · 2020-05-31T16:14:55Z

I came by the reflect.Append case allocating a new SliceHeader in value.Slice (called by the grow function unconditionally) in an append loop in the last weeks and optimized it by using reflect.MakeSlice and reflectSliceValue.Index(i).Set(...).

If we can reuse the existing pointed to SliceHeader in the reflect.Value and just set the underlying len to the new length we may be able to avoid an allocation per Append call. Seems like this would need a new API e.g. reflect.AppendInplace(s Value, x ...Value) (no return value) since it would be a behaviour change.

ugorji · 2021-03-08T02:49:10Z

@randall77 Sorry, I missed this earlier.

Zero is being handled in #33136 .

I agree that the channel ones are probably not worth worrying about, performance-wise.

Convert I'm not sure.

But Slice (and the ones that call Slice, like Append and AppendSlice) would be even more important than the map ones, I suspect. @ugorji I didn't understand how you think Append could be made allocation free - could you explain?

As @Martish alluded to, a new API like AppendInPlace(s Value, x ...Value) will be needed.

In the same vein, AppendSlice() can elide the allocation caused by the call to s.Slice, by doing something akin to:

Copy(s.Slice(i0, i1), t)

to

p := (*unsafeheader.Slice)(s.ptr)
p0 := *p
p.Data = arrayAt(p0.Data, i0, typ.elem.Size(), "i < cap")
p.Len = i1 - i0
Copy(s, t)
*p = p0

These augment the existing MapIter.Key and MapIter.Value methods. The existing methods return new Values. Constructing these new Values often requires allocating. These methods allow the caller to bring their own storage. The naming is somewhat unfortunate, in that the spec uses the word "element" instead of "value", as do the reflect.Type methods. In a vacuum, MapIter.SetElem would be preferable. However, matching the existing methods is more important. Fixes golang#32424 Fixes golang#46131 Change-Id: I19c4d95c432f63dfe52cde96d2125abd021f24fa

gopherbot · 2021-05-18T16:43:46Z

Change https://golang.org/cl/320929 mentions this issue: reflect: add MapIter.SetKey and MapIter.SetValue

These augment the existing MapIter.Key and MapIter.Value methods. The existing methods return new Values. Constructing these new Values often requires allocating. These methods allow the caller to bring their own storage. The naming is somewhat unfortunate, in that the spec uses the word "element" instead of "value", as do the reflect.Type methods. In a vacuum, MapIter.SetElem would be preferable. However, matching the existing methods is more important. Fixes golang#32424 Fixes golang#46131 Change-Id: I19c4d95c432f63dfe52cde96d2125abd021f24fa

bradfitz added the Performance label Jun 4, 2019

dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 10, 2019

dmitshur added this to the Go1.14 milestone Jun 10, 2019

rsc modified the milestones: Go1.14, Backlog Oct 9, 2019

martisch mentioned this issue Mar 9, 2021

reflect: no efficient way to append to, or grow, a slice #44870

Closed

zephyrtronium mentioned this issue Apr 10, 2021

cmd/compile, runtime: GOEXPERIMENT to add two non-pointer words to iface/eface #45494

Open

randall77 mentioned this issue May 11, 2021

proposal: reflect: add alloc-free way retrieve value from Value #46093

Closed

josharian mentioned this issue May 11, 2021

internal/deepprint: optimize tailscale/tailscale#1902

Merged

rsc mentioned this issue May 12, 2021

reflect: add MapIter.SetKey, SetValue, and friends #46131

Closed

gopherbot closed this as completed in 54cdef1 Aug 24, 2021

martisch mentioned this issue Aug 27, 2021

reflect: add reflect.Value.Grow #48000

Closed

golang locked and limited conversation to collaborators Aug 24, 2022

gopherbot added the FrozenDueToAge label Aug 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reflect: map iteration does unnecessary excessive allocation for non-pointer types #32424

reflect: map iteration does unnecessary excessive allocation for non-pointer types #32424

ugorji commented Jun 4, 2019

ugorji commented Jun 4, 2019

ugorji commented Jun 4, 2019

randall77 commented Jun 4, 2019

josharian commented Jun 4, 2019

ugorji commented Jun 4, 2019

randall77 commented Jun 4, 2019

ugorji commented Jun 4, 2019

JAicewizard commented Jan 27, 2020 •

edited

Loading

randall77 commented Jan 27, 2020

martisch commented May 31, 2020 •

edited

Loading

ugorji commented Mar 8, 2021

gopherbot commented May 18, 2021

reflect: map iteration does unnecessary excessive allocation for non-pointer types #32424

reflect: map iteration does unnecessary excessive allocation for non-pointer types #32424

Comments

ugorji commented Jun 4, 2019

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

ugorji commented Jun 4, 2019

ugorji commented Jun 4, 2019

randall77 commented Jun 4, 2019

josharian commented Jun 4, 2019

ugorji commented Jun 4, 2019

randall77 commented Jun 4, 2019

ugorji commented Jun 4, 2019

JAicewizard commented Jan 27, 2020 • edited Loading

randall77 commented Jan 27, 2020

martisch commented May 31, 2020 • edited Loading

ugorji commented Mar 8, 2021

gopherbot commented May 18, 2021

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?

JAicewizard commented Jan 27, 2020 •

edited

Loading

martisch commented May 31, 2020 •

edited

Loading