Saturday, May 16, 2015

Go: A Surprising Edge Case Concerning append and Slice Aliasing

In Go, the append function reallocates the underlying array if there's no more room. Hence, if you have two slices that point to the same array, after you append to one of the slices, the other slice may or may not point to the same array. This can lead to some unexpected behavior:

package main

import "fmt"

func main() {
 // Create a slice of length 1 and capacity 2.
 a := make([]int, 1, 2)

 // b "aliases" a
 b := a

 // If I set something in a, you see it in b.
 a[0] = 1
 fmt.Println(a[0], b[0]) // Prints 1 1; they're equal.

 // append returns a new slice, but with the same array.
 // Hence, if I set something in a, you still see it in b.
 a = append(a, 2)
 a[0] = 2
 fmt.Println(a[0], b[0]) // Prints 2 2; they're equal.

 // I'm doing the same thing as last time. However, this time,
 // append has to allocate a new array because there's not enough
 // space. Hence, if I set something in a, you don't see it in b.
 a = append(a, 3)
 a[0] = 3
 fmt.Println(a[0], b[0]) // Prints 3 2; they're *not* equal.
}

Having multiple aliases to the same mutable data can lead to some subtle behavior and lots of bugs if you're not careful. It's perhaps even more surprising to the naive reader since a and b don't look like pointers. In fact, they're not. They're slices which are a value type that have a pointer within them.

Immutable data structures (like lists in Lisp) don't have these issues. However, you lose some of the performance that a raw array can provide.


10 comments:

Nick said...

This is called, as I'm sure you know, a dangling pointer. One of the many endemic problems with this little language we all tried so hard to love.

Vlad Vissoultchev said...

Actually doubt very much `b` is dangling to deallocated memory in this case.

Anonymous said...

Prints 2. That's not a dangling pointer. One of the many endemic problems with this little developer that the language tried so hard to help.

Brian Olson said...

Nothing to see here. The contract for append() has always been that it might return the original pointer or it might copy-and-extend into a new pointer. That you can keep a reference of the previous version is programmer error for not having clear ownership of that pointer. There should be one canonical context for it, being a member of a struct or a local variable on the stack. Then you don't have to worry about having two copies of it. If you need multiple threads accessing the pointer, then you should have put a mutex on it.

Anonymous said...

> Immutable data structures (like lists in Lisp) don't have these issues. However, you lose some of the performance that a raw array can provide.

Um, what?

Lisp lists are very highly mutable. Aaaand they're not arrays.

Shannon Behrens said...

What I mean is that adding to the head of a list in Lisp does not impact the original list.

Nate Finch said...

It's not an alias. a := b copies the value of b into a.

Shannon Behrens said...

> It's not an alias. a := b copies the value of b into a.

Yes, I know. However, a and b are both references to the same underlying array. "alias" was the best term I could come up with.

Akshay said...

It is very clearly defined in Effective Go
Hence it is not a edge case

"Slices hold references to an underlying array, and if you assign one slice to another, both refer to the same array."

https://golang.org/doc/effective_go.html#slices

Shannon Behrens said...

This is known as the "Stale" Slice problem: http://devs.cloudimmunity.com/gotchas-and-common-mistakes-in-go-golang/index.html#stale_slices