How it works
A List uses the
] characters to contain the elements of the list separated by commas. It looks like this:
my_list = [1, 2, 3, 4, 5]
This looks like an Array in other languages. However, a List doesn’t act like an Array.
A list in Elixir (and other functional programming languages) is actually a recursive data structure. Internally implemented as a linked list.
When you realize an Elixir list is an immutable linked list, it’s behavior makes sense and how you use it becomes clear.
my_list is “bound” to the list. The variable isn’t “assigned” the value of the list, it is said to be “bound”. Variables in Elixir are all “bound” to their values. You can think of “bound” as meaning “is pointing to”. So
my_list is pointing to a specific element in a linked list. Each element points to the next element in the list and separately points to the value for the element.
In many languages, it is common to add more items to the end of an Array as it is built up. With an immutable List, we can’t add items to the end as that would be affecting other variables that are bound to some element earlier in the list.
If I really do want to add to the end of the list, I can, but a whole new list is created with that value at the end. When a list is very large, this becomes an expensive operation. Two lists can be concatenated together using the
my_list ++  #=> [1, 2, 3, 4, 5, 6]
Adding to the front of the List, or the “head” is very cheap. It doesn’t alter other variables that are already bound to some element in the List.
This makes it very efficient both in speed and memory consumption. To efficiently add to the front of a list, you use the
| pipe character to join the new element and the existing list together.
[0 | my_list] #=> [0, 1, 2, 3, 4, 5]
It can be cheaper and faster to build up a large list in reverse, adding to the “head” rather than re-building the whole list with each new addition. Then, once built, perform a single “reverse” of the whole list using Enum.reverse/1.
Enum.reverse([6, 5, 4, 3, 2, 1]) #=> [1, 2, 3, 4, 5, 6]
Thinking about a list as a “linked list” data structure where it’s made up of pointers to the actual data helps to understand how immutable data can work and be so memory efficient. The same kind of thing is happening with the other data structures, but I think it’s easiest to start imagining it with a list.
Working with immutable data may be a different experience for you. It probably feels uncomfortable. However, it is what enables concurrency and immutable data removes a whole category of state related bugs. If it is uncomfortable now, just know that understanding and embracing it will help make you a better programmer. And you’ll end up loving it!
See for yourself!
I described how adding to the front of a list is faster than adding to the end. The best way to really “get” this is to see and feel it for yourself.
Below I have 2 one-liner functions that exaggerate the differences so you can really feel it. Don’t worry about understanding the details of the code. Conceptually they both build a list of 100,000 items. They both add one number at a time to the list starting with the number 1 and going up to 100,000.
The difference is one function adds the item to the end of the list using list concatenation. That’s the code that says
acc + [n]. It means
list_so_far ++ [next_number].
Run this line in IEx and expect it to take some time.
Enum.reduce(1..100_000, , fn(n, acc) -> acc ++ [n] end)
Did you see how long that took?
Running Observer (it’s like an activity monitor built in to the BEAM), you can see the load it put on the machine.
The top chart shows the CPU impact and how much time it took. The bottom left chart shows the RAM impact. The saw-tooth line is from garbage collection being done.
Why is it so inefficient? With every item added, it re-creates the entire list so no other references to the list are affected.
Let’s try it again with code adds items to the front of the list. The important part of the code is the
[n | acc] which means
[next_number | list_so_far]. It adds the single item to the front of the list.
Run this line in IEx and see how long it takes to return.
Enum.reduce(1..100_000, , fn(n, acc) -> [n | acc] end)
Did that feel different? Let’s see the system impact of the code.
Notice the chart scale is reduced and that the CPU has one spike that stayed below 30%. The RAM also only slightly bumped.
Why is it so much more efficient? As the list is being built, it never gets re-created. Any other variable references to the list made during this operation won’t be affected by adding more items to the front. So it can just keep adding items to the front efficiently.
Should I only ever add to the front?
You might see this and ask, “should I only ever add items to the front of a list?” The answer is “no”. This example is specially designed to exaggerate the behavior. When building a list of 10, 100, or even 1,000 items, adding to the end of the list probably won’t even be noticed. However, it is important when working with much larger lists.
The point here is to understand how lists work in Elixir so you have the right mental model. Lists are internally implemented as linked list, not an array. You can’t treat them like an array.
Lists don’t just contain integers. Lists can contain any data type and can contain different types from one element to the next.
A list’s contents are ordered. They remain in the order they are declared.
[1, "Hello", 42.0, true, nil]
Experiment with Lists
Take some time to experiment with lists in IEx. Try concatenating lists together with
++, you can also remove elements from a list using
--. Try adding to the front of a list using the
| pipe separator. Try tweaking the list function examples to use different numbers. Experimentation is playing! Have fun playing!