More knowledge in 3 minutes than a whole Medium article
🎉🎉
Episode 20 - Arrays in Pure Bash
This episode introduces one of the bash compound data structures, arrays.
Crib sheet for what Dave shows:
1. Array assignment, varname=(list), where list is simply any text, subjected to bash word-splitting, each word getting assigned a position.
2. Expanding a full array, so that it isn't subject to new word splitting: "${varname[@]}"
3. Expanding a full array into a single word, with the elements joined by the first character of $IFS and no further word splitting: "${varname[*]}"
4. String quoting with literal newlines in between the double-quotes. This isn't really discussed in the episode, but if you hadn't seen it before, it's worth noting. Same thing works with single-quotes, and most other bash structures.
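Items 1 and 4 of the crib sheet can be sketched roughly like this. The variable names are made up for illustration and are not taken from the episode.

```bash
#!/usr/bin/env bash
# Hypothetical variable names, chosen only for this illustration.

# Unquoted words become separate elements; quoted text stays together.
fruits=(apple banana "dragon fruit")
echo "${#fruits[@]}"      # 3 elements, not 4

# A double-quoted string may contain literal newlines.
multiline="first line
second line"
lines_arr=("$multiline")
echo "${#lines_arr[@]}"   # 1 element, because the quotes prevent splitting
```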
I'll go over some things Dave didn't cover in the episode.
- The difference between "${varname[*]}" and ${varname[*]}, and how "${varname[*]}" interacts with $IFS
- Checking the size of an array.
- ${!varname[@]} and ${!varname[*]}.
- Indexing individual array elements, and assigning individual array elements.
- The other ways to declare an array.
- Adding elements to an array.
- Weird corner cases that can arise from assigning and unassigning individual elements directly.
My git repo ( github.com/extrageneity/ysap-comments/ ) includes an additional script you can look at, at github.com/extrageneity/ysap-comments/blob/main/pt020-bash-arrays/more-bash-arrays.sh , looking at much of this in more detail.
Dave shows two of the four possible ways you can expand an array:
- "${varname[@]}", which expands the array such that each element becomes a distinct word (a separate argument, for example), without being subjected to further bash word splitting.
- "${varname[*]}", which joins every element of the array with the first character of $IFS as a field separator, such that the joined string is a single word, not subject to further word splitting.
- ${varname[@]} and ${varname[*]}, which are functionally identical: the array is expanded, and each element is then subject to further word splitting (and pathname expansion) by bash. In the extremely rare case of IFS being the null string such that no further word splitting happens, _both_ split the same way as "${varname[@]}", which I found surprising.
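A minimal sketch of the four forms, assuming the default $IFS. count_args is a hypothetical helper written for this comment, not a bash builtin; it only reports how many arguments it was given.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

arr=("one two" "three")

count_args "${arr[@]}"   # 2: one word per element
count_args "${arr[*]}"   # 1: elements joined into a single word
count_args ${arr[@]}     # 3: each element is split again on whitespace
count_args ${arr[*]}     # 3: same as the unquoted @ form

# "${arr[*]}" joins with the first character of IFS (changed only in the subshell).
joined=$(IFS=,; echo "${arr[*]}")
echo "$joined"           # one two,three
```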
What else can we do, subscripting with @ and *? First, we can check the size of the array, with either ${#varname[@]} or ${#varname[*]}. Formally, this is the number of elements in the array, which isn't always the same as the last index plus 1. In bash it's possible to have an array where certain subscripts aren't assigned at all. The other thing you can do by subscripting with @ or * is either ${!varname[@]} or ${!varname[*]}. This expands to all keys/subscripts which exist in the array, rather than expanding to all values.
As you'll see in the array2 example in more-bash-arrays.sh, we can also initialize array fields directly by subscript using varname[subscript]=value. If we initialize only subscripts 0 and 3 in an array, we end up in the unusual scenario where the numeric value of ${#varname[@]} is smaller than the numeric value of the last index in the array. Similarly, if you do unset 'varname[subscript]' (quoted, so the brackets aren't treated as a glob pattern), the key is simply deleted from the array. Other array members are not left-shifted automatically.
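Here is a small sketch of the sparse-array behavior described above. It is separate from the array2 example in the repo script, and the array name is made up.

```bash
#!/usr/bin/env bash
# "sparse" is a hypothetical array name used only for this illustration.
sparse=()
sparse[0]="first"
sparse[3]="fourth"

echo "${#sparse[@]}"   # 2: the number of assigned elements
echo "${!sparse[@]}"   # 0 3: the subscripts that exist

# Quote the argument so the shell does not treat [ ] as a glob pattern.
unset 'sparse[0]'
echo "${!sparse[@]}"   # 3: the remaining element keeps its subscript
```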
You can declare an empty array in two ways. You can say declare -a varname, or you can say varname=(). Both initialize an array with no elements. Bash will also implicitly declare arrays for you when you run certain builtins. mapfile, also known as readarray, is intended to work with arrays directly, and loads lines into an array similarly to how a 'while read' loop does. The read builtin will also load words into an array with the -a switch.
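As a rough sketch of those declaration paths, assuming a bash new enough to have mapfile (bash 4 or later). The input text is invented for the example.

```bash
#!/usr/bin/env bash
declare -a empty_one
empty_two=()
echo "${#empty_one[@]} ${#empty_two[@]}"   # 0 0

# mapfile (readarray) stores each input line as one element; -t drops the newline.
mapfile -t lines < <(printf '%s\n' "alpha beta" "gamma")
echo "${#lines[@]}"    # 2

# read -a splits a single line into words and stores them in an array.
read -r -a words <<< "alpha beta gamma"
echo "${#words[@]}"    # 3
```

Note that mapfile keeps "alpha beta" as one element, while read -a splits the same kind of text on $IFS.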
You can add additional values to an array using varname+=(list).
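A quick sketch of appending, with a made-up array name. The second half shows a common slip: leaving off the parentheses appends text to element 0 rather than adding an element.

```bash
#!/usr/bin/env bash
# "queue" is a hypothetical array name used only for this illustration.
queue=(a b)
queue+=(c "d e")
echo "${#queue[@]}"   # 4

# Without the parentheses, += appends text to element 0 instead.
queue+=z
echo "${queue[0]}"    # az
echo "${#queue[@]}"   # still 4
```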
When you're working with array subscripts, during either assignment or variable expansion, the subscript is a math expression, just as in (( )) bash arithmetic syntax. If your math expression evaluates to a negative number, bash counts back from the end of the array: the number is added to one more than the highest index, so -1 refers to the last element.
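A minimal sketch of arithmetic and negative subscripts, assuming a reasonably recent bash (older releases reject negative subscripts). The names are invented for the example.

```bash
#!/usr/bin/env bash
# "letters" and "i" are hypothetical names used only for this illustration.
letters=(a b c d e)
i=1

echo "${letters[i + 2]}"   # d: the subscript is evaluated as arithmetic
echo "${letters[-1]}"      # e: the last element
echo "${letters[-2]}"      # d

letters[i+1]="C"           # the subscript evaluates to 2
echo "${letters[2]}"       # C
```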
Everything we've discussed above applies to a form of arrays bash calls "indexed arrays." There's a separate data-type, "associative arrays," which are more similar to hashes or K/V pairs from other languages. I won't discuss those here, in part because I expect Dave will do an episode using them eventually, and in part because YouTube has a comment length limit and if I hit it I'm going to spend hours re-writing to get under it... again.
Conclusions.
How is this a "you suck at programming" topic? Dave's point here was, again, bash word splitting can get you in trouble if you're working with data where your values contain characters bash is splitting against--and that this can be a reason to use arrays instead of just using large single variables with something like newlines in it, and then letting bash word splitting slice up the variable for you. And that's definitely important. Getting the most out of bash means understanding how word-splitting works, how to take advantage of it when you want to, and how to get around it when you don't want it. Arrays are one of the most powerful options you have when getting around word splitting is what you need.
But there are other reasons you will suck less at bash programming if you master this concept. Set aside for the moment the most basic one, which is that it's hard to do much of anything valuable in a programming language without at least some access to compound data-structures. Bash has capabilities you can't access without arrays. I already mentioned a couple of those above. That mapfile builtin, which is too complex to discuss here, is disgustingly potent in certain use cases.
One other important place where you end up working with arrays in bash? $PIPESTATUS is a special variable that bash sets any time you run a piped expression. Given the example command:
command1 | command2 | command3
... after that command completes, $PIPESTATUS will be an array where:
${PIPESTATUS[0]} contains the return code of command1
${PIPESTATUS[1]} contains the return code of command2
${PIPESTATUS[2]} contains the return code of command3
... and $? will contain the return code of the entire pipeline, normally the same as ${PIPESTATUS[2]}.
PIPESTATUS is ephemeral in the same way $? is, meaning it will change the next time you run anything bash treats as a pipeline. You want enough array knowledge to be able to copy it off to a separate variable when you want to look at status codes for a specific pipeline. You can do something like:
my_important_pipestatus=( "${PIPESTATUS[@]}" )
... to do that. There are other interesting array variables too. If you want to look at them all, you can run:
set | grep '=('
Then, reference them in the bash man page.
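Going back to the PIPESTATUS copy for a moment, here is a minimal sketch of it in action. true and false stand in for real commands, and saved_status is a made-up name.

```bash
#!/usr/bin/env bash
# true and false stand in for real commands in this illustration.
true | false | true
saved_status=("${PIPESTATUS[@]}")   # copy immediately, before running anything else

echo "${saved_status[*]}"    # 0 1 0
echo "${#saved_status[@]}"   # 3
```

The copy has to be the very next command, since anything else you run in between replaces PIPESTATUS.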
So, that's it. For now, anyway. Arrays in bash. Indexed arrays, as they're known more formally. Another tool in your toolbox, now hopefully de-mystified. When you're ready for additional reading, see the 'Arrays' section of the PARAMETERS chapter in the bash man page, and then the 'Parameter Expansion' section of the "EXPANSION" chapter. There are a number of array-specific parameter expansions I haven't covered here, for example array slicing with ${varname[@]:offset:length}. I won't cover those exhaustively; hopefully after reading this, you have a sense of where to start in the man page itself.
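As a starting point for the slicing syntax just mentioned, a short sketch with invented names:

```bash
#!/usr/bin/env bash
# "letters", "slice" and "rest" are hypothetical names used only for this illustration.
letters=(a b c d e)

slice=("${letters[@]:1:3}")   # start at offset 1, take 3 elements
echo "${slice[*]}"            # b c d

rest=("${letters[@]:3}")      # omit the length to take everything from offset 3
echo "${rest[*]}"             # d e
```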
Happy indexing!
bruh how are you this fast and this detailed tho
@@la.zanmal. Atomoxetine and ADHD hyperfocus, mostly
@@extrageneity neurodivergents unite 🫡
Very useful script. You missed a couple of echo (for style) on L:49 and L:97. Would you mind a pull request? I recently started learning git for managing my dotfiles, but I have yet to push anything in GitHub, I could try it.
How much chatgpt (if any) are you leveraging for these comments? i've been curious for a bit and have never asked. I ask now because sometimes I find the shift in tone in your comments from a directed style of communication to a more passive-voice to be surprising, and also a lot of the comments end with "Happy _____!" which i see *a lot* when dealing with LLMs.
LET ME BE CLEAR - i don't ask in a weird snide/passive-aggressive way - i'm literally genuinely curious if these comments are assisted at all, or are 100% your own creation - because EITHER WAY i'm impressed man lmao. These show some serious dedication to the "extra" in your name, as you've said, haha and i love them.
I'm gonna shout-out these comments in a video on insta/tiktok to let people know they can go find further reading / the appendix in my comments section 😎 i love the dedication bro.
Super information dense and spoken very clearly, awesome video!
Perfect time to see this on my recommended page, as I am struggling with a bash array at this very moment.
i got u
I like the energy, subscribed
Great video! But my eyes suck... I would love to see LARGER TEXT. :) thank you!!
Low contrast themes also make it difficult to see in a brighter setting. :)
great channel man! would love to see more strictly POSIX-compliant stuff from you as well (i.e. /bin/sh --> dash)
i’ve been thinking about this tbh
If you're using arrays in bash, stop, use a real language, then continue
That's why I do my scripts in rust
There are reasons to use arrays in bash even when doing things which are purely native to bash. For example, say you have a script which is doing something like: command1 | command2 | command3
... and your exception handler needs to look not just at the output of the pipeline, but at exit codes of each command.
To do that, you can inspect ${PIPESTATUS[0]}. What is PIPESTATUS? A bash array!
Say you want to save the PIPESTATUS array off to a different variable, because in the course of inspecting PIPESTATUS and output from that command you're going to run a different pipeline.
_my_original_pipestatus=( "${PIPESTATUS[@]}" )
Congratulations, you just initialized a new array that won't be blown away the next time you run a pipe.
There are plenty of reasons to switch from bash to another language. Needing arrays, or even associative arrays, shouldn't be the threshold.
woah woah woah… bash is absolutely a real language.
@mattymattffs
A stochastic parrot seen in real life! I can't believe it! Such a non-rare sight!
At least quote Primeagen when it comes to it man.
Arrays in bash have their uses if you understand them, you just have to be a sufficiently advanced shell user to leverage them correctly.
Of course, they should only be used with experience, and in the (not necessarily niche) use cases that suit them.
For example:
What if you are a red teamer and you discovered a bunch of hosts that don't have internet and nothing installed, except bash. You know those hosts can communicate with an internal git server, which you gained control over. You manage to manually deploy a service on these machines that'll create a branch on the git server where the name of the branch is the hostname of the machine.
Now, on your attacker machine, you are writing a c2 interface to control all of these machines via the branches as a communication channel. Of course, you want a mechanism to control machines you've "infected". With git-branch you can list all the branches, but you want them in a more manageable format, hm, maybe an ARRAY! Of course, since you know your bash, you know that the perfect command for that is
mapfile -t machines <
@@cubernetes no
I managed 1d arrays but I was trying to do a 2d array with 5 strings per element. No problem with Python or C, but in bash I can't do it. Is it even possible?
bash doesn't have 2-dimensional arrays, but it does have ways in parameter expansion of referencing variable names indirectly, which means that you can bodge it together yourself. Typically when I've needed something like that I would just build up 5 associative arrays, 1 for each of the 5 strings I needed, and have the key in each of those associative arrays be the same. There are even ways using indirect variable references to do this for sets of arbitrary length.
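A rough sketch of that parallel-array idea, using two fields instead of five to keep it short. The field names, keys, and values are all invented, and the nameref at the end is only one possible way to do the indirect reference (it assumes bash 4.3 or later).

```bash
#!/usr/bin/env bash
# Hypothetical record layout: one associative array per field, all sharing the same keys.
declare -A host_ip host_role
host_ip[web1]="10.0.0.1"; host_role[web1]="frontend"
host_ip[db1]="10.0.0.2";  host_role[db1]="database"

# Walk the shared keys and read the matching entry from each field array.
# Key order in an associative array is not guaranteed.
for key in "${!host_ip[@]}"; do
    echo "$key ${host_ip[$key]} ${host_role[$key]}"
done

# Indirect access: a nameref lets you pick the field array by name at run time.
field="host_role"
declare -n ref="$field"
echo "${ref[db1]}"   # database
```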
yeah, no nested data structures in bash sadly
@@extrageneity I don't like to be beaten, but I think I have to resign myself to awk being the better option. Thanks though.
@@pldvs I love awk and often pipe to it from bash when I need multi-dimensional arrays. Use it proudly. :-)
agree with @extrageneity - use the right tool for the job and use it proudly - awk is AWESOME
BTW ... I use Arch.
Now do a red-black tree
what os you using?
Great video! I like using files in an array in bash, but I use awk to print $1. Do you have a better solution?
i'm not sure what you mean exactly
@@yousuckatprogramming I meant an array in a file for installing multiple packages at a time. Instead of a list I use a file with the package names in it.
@@yousuckatprogramming I figured it out: I use while with read line
But "${arr[@]}" and "${arr[*]}" are not the same as ${arr[@]} and ${arr[*]}, right?
Why is anyone still using bash when zsh exists?