I decided to write a small parser script in bash, but I'm not very good at bash or jq. I'm using curl to fetch JSON from reddit and jq to extract values from it. I want to get the titles as a list of sentences. What is the best way to do that?

Code example

#getting title
titles=($(echo "${json}" | jq '.data.children[].data.title'))
echo "full list is"
echo ${titles[@]}
echo

#copied by hand from the previous output ^
hand_titles=("Developers Should Celebrate Software Development Being Hard" "Lies we tell ourselves to keep using Golang")

echo "I want to call var like this and get this output:"
echo ${hand_titles[0]}
echo
echo "But instead I get this: "
echo ${titles[0]}

Console output

full list is
"Developers Should Celebrate Software Development Being Hard" "Lies we tell ourselves to keep using Golang"

I want to call var like this and get this output:
Developers Should Celebrate Software Development Being Hard

But instead I get this:
"Developers

I want to use a for loop to iterate through the list in parallel and use ${titles[i]}, and for this I need each element to be a whole sentence, "Developers Should Celebrate Software Development Being Hard", not a single word.

Maybe I'm supposed to write it to a file and then read it back in order to use it properly? I don't know.

Charles Duffy
Vitalcion
  • `array=( $(anything) )` is an antipattern; see [BashPitfalls #50](https://mywiki.wooledge.org/BashPitfalls#hosts.3D.28_.24.28aws_.2BICY.29_.29). – Charles Duffy Dec 31 '22 at 15:08
  • Please add a suitable shebang (`#!/bin/bash`) and then paste your script at http://www.shellcheck.net/ – Cyrus Dec 31 '22 at 15:12
  • ...and to understand the root cause of the problem, see [BashFAQ #50](https://mywiki.wooledge.org/BashFAQ/050). The very short explanation of the problem is that literal and syntactic quotes are _not the same thing_ and don't substitute for each other; quotes that are _part of your data_ don't act like quotes that are part of your code, and _shouldn't_ -- if they did, it would be impossible to write correct/safe code in shell languages that handled untrusted data. – Charles Duffy Dec 31 '22 at 15:13
  • BTW, `echo ${titles[@]}` acts **exactly** like `echo ${titles[*]}`; it has no pretense at all of keeping the boundaries between your individual elements intact. _Always_ quote your expansions -- and, when you want to know the boundaries, don't use `echo` at all; `printf '<%s> ' "${titles[@]}"; echo` tells you where each individual title starts and stops so you can see if your array is correct. (Then again, so does `declare -p titles`). – Charles Duffy Dec 31 '22 at 15:20
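
The point made in the comments above can be reproduced without reddit or jq. The following is only a minimal sketch: `raw` is hypothetical sample data standing in for what jq prints without `-r`, and `split` is an illustrative array name. It shows that quotes inside the data do not group words when an unquoted expansion is split, and it uses the `printf` and `declare -p` checks suggested above to make the element boundaries visible.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: what jq prints without -r
# (the double quotes are part of the data, not shell syntax).
raw='"Developers Should Celebrate" "Lies we tell ourselves"'

# Antipattern: the unquoted expansion is split on whitespace;
# the literal quotes inside the data do not hold words together.
split=( $raw )

echo "element count: ${#split[@]}"
printf '<%s> ' "${split[@]}"; echo   # brackets mark each element's boundaries
declare -p split                      # same information, as a declare statement
```

With this sample the array ends up with seven elements, one per whitespace-separated word, and the first one is `"Developers` with the quote character still attached.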

2 Answers

Assuming your titles can't contain literal newlines (well, literal after decoding, \n within JSON strings), the easy way to do this is:

readarray -t titles < <(jq -r '.data.children[].data.title' <<<"$json")
echo "full list is"
printf ' - %s\n' "${titles[@]}"
echo "First title is ${titles[0]}"

`readarray` (also called `mapfile`) is a bash 4.0 feature that reads each line of input into a separate array element; using `jq -r` makes jq's output line-oriented without extra JSON quoting/escaping.
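
To try the `readarray -t` part in isolation, here is a small sketch in which `printf` stands in for the `jq -r` output (an assumption made so the example runs without jq or network access; the two titles are the ones from the question). It also shows the index-based loop the question asks about.

```shell
#!/usr/bin/env bash
# printf stands in for `jq -r '.data.children[].data.title'`:
# one title per line, no JSON quoting.
readarray -t titles < <(printf '%s\n' \
  'Developers Should Celebrate Software Development Being Hard' \
  'Lies we tell ourselves to keep using Golang')

echo "count: ${#titles[@]}"

# Iterate by index, so other arrays could be read in parallel with the same i.
for i in "${!titles[@]}"; do
  printf '%d: %s\n' "$i" "${titles[i]}"
done
```

Each array element holds one complete line, so `${titles[0]}` is the whole first title rather than its first word.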


If they can contain newlines, it gets a little trickier:

readarray -d '' titles < <(
  jq -j '.data.children[].data.title | (., "\u0000")' <<<"$json"
)
echo "full list is"
printf ' - %s\n' "${titles[@]}"
echo "First title is ${titles[0]}"

`-d ''` tells `readarray` to expect NUL-terminated items (the `-d` option needs bash 4.4 or newer); `-j` tells jq to produce raw output without appending a newline after each item; `(., "\u0000")` adds those NUL terminators manually. (If you're dealing with data that will be interpreted on the other side of a privilege boundary from that data's source, think about stripping any NULs inside the JSON before adding new ones as separators; I've been known to put something like `gsub("\u0000"; "<NUL>")` inside my pipelines.)
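
The NUL-delimited read can be sketched on its own as well. Here `printf '%s\0'` is an assumed stand-in for the `jq -j` pipeline above, and the two sample titles are made up; the first one deliberately contains a newline. The sketch passes `-t` in addition to `-d ''`, which is a common way to write it.

```shell
#!/usr/bin/env bash
# Needs bash 4.4+ for `readarray -d`.
# printf '%s\0' stands in for the jq -j pipeline: every item is followed by a NUL.
readarray -t -d '' titles < <(printf '%s\0' \
  $'First line\nsecond line' \
  'Plain title')

echo "count: ${#titles[@]}"
printf '<%s>\n' "${titles[@]}"   # brackets mark each element's boundaries
```

The title with the embedded newline stays a single element, which a line-oriented read would have split in two.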


In both examples above, note that we print each array element on its own line to demonstrate that the items were held together correctly.

Charles Duffy

You could make jq generate the body of a declare statement:

$ unset titles
$ declare -a titles="($(jq -r '.data.children[].data.title | @sh' <<< "$json"))"

Then, you can use the resulting Bash array:

$ echo "${titles[0]}"
Developers Should Celebrate Software Development Being Hard
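
To see what `declare` does with that text, here is a sketch that skips jq entirely. `sh_quoted` is hand-written sample text in the form `@sh` is expected to emit (each string wrapped in single quotes, with an embedded single quote written as `'\''`); both the variable name and the second title are made up for illustration.

```shell
#!/usr/bin/env bash
# Hand-written stand-in for the output of `jq -r '... | @sh'`.
# Its content is:  'Developers Should Celebrate' 'It'\''s fine'
sh_quoted="'Developers Should Celebrate' 'It'\\''s fine'"

unset titles
# declare parses the string as array-assignment syntax, so the
# single quotes in the data now act as shell quoting.
declare -a titles="($sh_quoted)"

echo "count: ${#titles[@]}"
printf '<%s>\n' "${titles[@]}"
```

Because the string is parsed as shell syntax, this is only as safe as the quoting of its content, which is why the `@sh` filter matters here.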
pmf
  • This _works_, and is safe because you're using `@sh` to generate eval-safe escaping, but insofar as you're depending on `declare` parsing data as code, it's not something I'd defend as good practice (as soon as someone uses it in a context that doesn't have any equivalent to that `@sh`, we've got shell injection vulnerabilities). – Charles Duffy Dec 31 '22 at 15:30
  • @CharlesDuffy True. Unless there is a safe, dedicated interface for this (or any kind of) job, every workaround has its own caveats. – pmf Dec 31 '22 at 15:33