I am trying to programmatically access clinical information for multiple donors in multiple projects using the ICGC API. The goal is to filter the donors by a specific cancer type, in this case Cholangiocarcinoma. I have written a bash script to retrieve the donor IDs, but I run into a problem when a project has more than 100 donors, because the results are then split across multiple pages.
Here is the script I have so far:
#!/bin/bash

# Define the list of projects
projects=("BTCA-SG" "BTCA-JP" "LIHC-US" "LIRI-JP")

# Loop through the projects and generate a list of donors
for project in "${projects[@]}"; do
    page=1
    while true; do
        result=$(curl -X GET --header 'Accept: application/json' "https://dcc.icgc.org/api/v1/projects/${project}/donors?size=1000&page=${page}" | jq -r '.hits[].id')
        if [[ -z "$result" ]]; then
            break
        fi
        echo "$result"
        page=$((page+1))
    done
done > donor_ids.txt
I noticed that changing the page parameter in the API request does not affect the returned results. I expected each new page value to return the next page of results, but the API returns the same set of results regardless of the page value. Because the result is then never empty, the while loop also never terminates. I would appreciate any guidance on how to handle pagination properly and retrieve all the donor IDs for the specified projects.
Thank you in advance for your help!
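For reference, here is a minimal sketch of one way the paging loop could be restructured. It rests on an assumption that should be checked against the API documentation: that the endpoint pages with a 1-based from offset combined with size (capped at 100 per request) rather than with a page parameter. The helper names fetch_ids and list_donor_ids and the page_size variable are hypothetical and only serve the illustration.

```shell
#!/bin/bash
# Sketch only. ASSUMPTION: the endpoint pages with a 1-based `from` offset
# plus `size` (at most 100 per request) instead of a `page` parameter.
# Verify this against the API documentation before relying on it.

page_size=100

# Hypothetical helper: print the donor IDs of one page, one ID per line.
fetch_ids() {
    local project=$1 from=$2
    curl -s --header 'Accept: application/json' \
        "https://dcc.icgc.org/api/v1/projects/${project}/donors?size=${page_size}&from=${from}" |
        jq -r '.hits[].id'
}

# Hypothetical helper: advance the offset until a short or empty page appears.
list_donor_ids() {
    local project=$1 from=1 ids count
    while true; do
        ids=$(fetch_ids "$project" "$from") || return 1
        # An empty page means there is nothing left to fetch.
        [[ -z "$ids" ]] && break
        printf '%s\n' "$ids"
        # A page shorter than page_size must be the last one.
        count=$(printf '%s\n' "$ids" | wc -l)
        (( count < page_size )) && break
        from=$((from + page_size))
    done
}
```

With these functions defined, the inner while loop of the script above could be replaced by a single call to list_donor_ids "$project", keeping the outer loop and the redirect to donor_ids.txt unchanged. Stopping on a short page avoids one extra request per project, and the empty-page check covers projects whose donor count is an exact multiple of the page size.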