[sim2_daily] Implement API Data Pagination
Issue: Implement Pagination for the `/sim2_daily` API Endpoint
Summary: This issue addresses the need to implement pagination for the `/sim2_daily` API endpoint to improve performance when querying the SIM2 daily database, especially for large rectangular regions. The endpoint currently returns data in CSV format and is used to access data from the Meteo-France dataset (https://www.data.gouv.fr/fr/datasets/6569b27598256cc583c917a7/).
Background: The `/sim2_daily` API allows users to query a rectangular region of the SIM2 daily database based on Lambert II coordinates (LAMBX, LAMBY) and an optional date range. Given the potential size of the data within a given region, returning the entire result set at once can lead to performance issues and client-side memory constraints.
Proposed Solution: Implement pagination to allow clients to retrieve data in smaller, manageable chunks (pages). This will significantly improve performance and scalability.
Detailed Implementation:
- API Parameters: Add `offset` and `limit` parameters to the API.
  - `offset`: Integer representing the starting row number (0-indexed).
  - `limit`: Integer representing the maximum number of rows to return per page.
- Data Filtering & Slicing: Modify the API logic to:
  - Apply the existing filtering criteria (LAMBX, LAMBY, DATE).
  - Use the `slice()` function to retrieve the correct set of rows based on `offset` and `limit` (see the helper sketch under the Notes below).
- Metadata Delivery: Alongside the CSV data, provide the following pagination metadata. Because the API is serialized as CSV, the metadata must be delivered via HTTP headers, not within the CSV itself.

To dynamically add pagination data (e.g., `X-Total-Count`, `X-Page`, or even a `Link` header) when using `serializer_csv()` in Plumber, the best practice is to modify the response headers dynamically inside your endpoint function, based on your data and pagination logic.
Here’s how you can do it step-by-step:
🔧 1. Example with dynamic pagination headers
```r
library(plumber)

# `%||%` ("use y when x is NULL") ships with base R >= 4.4; define it otherwise.
`%||%` <- function(x, y) if (is.null(x)) y else x

#* Paginated CSV endpoint
#* @get /paginated-csv
#* @serializer csv
function(req, res, page = 1, per_page = 10) {
  # Simulate data
  full_data <- data.frame(
    id = 1:100,
    name = paste("Item", 1:100)
  )

  # Pagination logic
  page <- as.integer(page)
  per_page <- as.integer(per_page)
  start <- ((page - 1) * per_page) + 1
  end <- min(nrow(full_data), start + per_page - 1)

  # Subset data; requests past the last page return an empty page rather than NAs
  if (start > nrow(full_data)) {
    paginated_data <- full_data[0, , drop = FALSE]
  } else {
    paginated_data <- full_data[start:end, , drop = FALSE]
  }

  # Set pagination headers
  res$setHeader("X-Total-Count", nrow(full_data))
  res$setHeader("X-Page", page)
  res$setHeader("X-Per-Page", per_page)

  # Optional: add a Link header for navigation
  base_url <- req$rook.url_scheme %||% "http"
  host <- req$HTTP_HOST %||% "localhost"
  path <- req$PATH_INFO
  next_page <- page + 1
  prev_page <- max(1, page - 1)
  link_header <- sprintf(
    '<%s://%s%s?page=%d&per_page=%d>; rel="next", <%s://%s%s?page=%d&per_page=%d>; rel="prev"',
    base_url, host, path, next_page, per_page,
    base_url, host, path, prev_page, per_page
  )
  res$setHeader("Link", link_header)

  return(paginated_data)
}
```
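To see these headers in action, you can query the running API from R. The host and port below are assumptions (plumber prints the actual address when the API is started with `pr_run()`), and any HTTP client would work equally well:

```r
library(httr)

# Request page 2 with 10 rows per page, then inspect the pagination headers.
resp <- GET("http://localhost:8000/paginated-csv",
            query = list(page = 2, per_page = 10))

headers(resp)[["X-Total-Count"]]  # lookup is case-insensitive, e.g. "100"
headers(resp)[["Link"]]           # navigation links for the next/previous page
```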
📥 Response headers (example)
```
X-Total-Count: 100
X-Page: 2
X-Per-Page: 10
Link: <http://localhost/paginated-csv?page=3&per_page=10>; rel="next", <http://localhost/paginated-csv?page=1&per_page=10>; rel="prev"
```
✅ Notes
- API consumers can use these headers to manage pagination.
- If you're exporting this as a downloadable CSV (`filename="data.csv"`), you can still set these headers, and they'll be included in the HTTP response (but not in the CSV content itself).
- You can wrap this pattern in a helper function to generalize it across routes (a sketch follows).
Acceptance Criteria:
- The API accepts `offset` and `limit` parameters.
- The API returns the correct number of rows for a given `offset` and `limit`.
- The API delivers pagination metadata via HTTP headers as described above.
- The API correctly handles cases where `offset` + `limit` exceeds the total number of matching rows.
- Performance is demonstrably improved for large datasets when using pagination.
- The API documentation (including the Plumber header) is updated to reflect the new parameters and HTTP headers.
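To make the documentation criterion concrete, here is a rough sketch of what the updated Plumber header and endpoint skeleton could look like. The parameter names, the `get_sim2_data()` filtering step, and the `paginate()` helper from the Notes above are assumptions standing in for the existing `/sim2_daily` implementation, which isn't shown in this issue:

```r
#* Query a rectangular region of the SIM2 daily database (paginated)
#* @param lambx_min:int Minimum LAMBX coordinate (Lambert II)
#* @param lambx_max:int Maximum LAMBX coordinate (Lambert II)
#* @param lamby_min:int Minimum LAMBY coordinate (Lambert II)
#* @param lamby_max:int Maximum LAMBY coordinate (Lambert II)
#* @param start_date Optional start of the date range (YYYY-MM-DD)
#* @param end_date Optional end of the date range (YYYY-MM-DD)
#* @param offset:int Starting row number, 0-indexed (default 0)
#* @param limit:int Maximum number of rows to return per page
#* @response 200 One CSV page; X-Total-Count, X-Offset and X-Limit headers carry the pagination metadata
#* @get /sim2_daily
#* @serializer csv
function(req, res, lambx_min, lambx_max, lamby_min, lamby_max,
         start_date = NA, end_date = NA, offset = 0, limit = 1000) {
  # get_sim2_data() stands in for the existing LAMBX/LAMBY/DATE filtering;
  # only the final pagination step is new.
  filtered <- get_sim2_data(lambx_min, lambx_max, lamby_min, lamby_max,
                            start_date, end_date)
  paginate(filtered, res, offset, limit)
}
```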