HTTP Data Source Puller
The HTTP Data Source Puller is a fundamental component within DatErica for integrating external data into your pipelines. It fetches data from specified APIs and is adaptable for various data retrieval scenarios.
How to Configure and Use
1. Setting Up Your Data Source
- Specify the HTTP Method and the URL of the API endpoint you want to fetch data from. These fields define the nature of the request and the target resource.
- Add any necessary Headers and Body content (for methods such as POST or PUT), and adjust the Advanced Settings to match the requirements of your API.
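To make the roles of these fields concrete, here is a minimal Python sketch of how a method, URL, headers, and body combine into one request. It is not DatErica's implementation: the helper build_request, the endpoint URL, the JSON body encoding, and the rule that only POST and PUT accept a body are all assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: only these methods carry a body in this sketch.
METHODS_WITH_BODY = {"POST", "PUT"}


def build_request(method, url, headers=None, body=None):
    """Assemble a request object from the configured fields without sending it."""
    method = method.upper()
    headers = dict(headers or {})
    data = None
    if body is not None:
        if method not in METHODS_WITH_BODY:
            raise ValueError(f"{method} requests do not take a body in this sketch")
        # Assumption: the body is sent as JSON.
        data = json.dumps(body).encode("utf-8")
        headers.setdefault("Content-Type", "application/json")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


req = build_request(
    "POST",
    "https://api.example.com/v1/orders",  # hypothetical endpoint
    headers={"Authorization": "Bearer <token>"},
    body={"status": "open"},
)
print(req.get_method(), req.full_url)
```

The request is only constructed here, not sent, so the sketch can be run without network access.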
2. Executing the API Request
After configuring your data source, click the Test Send button to execute the API call. This triggers a backend process that sends the request to the specified API and retrieves the response.
3. Reviewing the Result Data Structure
After a successful API call, the Result Data Structure section displays the API response so you can verify that its structure meets your requirements.
4. Response Payload Configuration
Once you have the results, use the Response Payload Config section to set the 'Response payload root property path'. This setting defines which part of the API response the pipeline should use. For example, if the response is an object containing a data key that holds the relevant array or object, specify response > data as the root path.
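The effect of a root path can be sketched in a few lines of Python. This is an illustration only: the function resolve_root_path, the sample payload, and the reading that a leading "response" segment stands for the whole response body are assumptions, not documented DatErica behavior.

```python
def resolve_root_path(response_body, path):
    """Return the part of `response_body` that a path like 'response > data' selects."""
    segments = [s.strip() for s in path.split(">") if s.strip()]
    # Assumption: the leading "response" segment refers to the body itself.
    if segments and segments[0] == "response":
        segments = segments[1:]
    current = response_body
    for key in segments:
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"'{key}' not found while resolving '{path}'")
        current = current[key]
    return current


# Hypothetical API response with the useful rows nested under "data".
api_response = {"data": [{"id": 1}, {"id": 2}], "meta": {"count": 2}}
rows = resolve_root_path(api_response, "response > data")
print(rows)
```

With this path, only the list under data would be handed to the rest of the pipeline; the meta object is ignored.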
Required Configuration
Request Details
- HTTP Method: Defines the operation to perform on the resource, such as GET, POST, or PUT.
- URL: The endpoint from which data will be retrieved. This field is mandatory.
Optional Configuration
These settings are not mandatory but provide additional control over the data fetching process:
Body
Available only for methods like POST or PUT, this section allows you to send a payload with your request.
Headers
Customize request headers to include API keys, bearer tokens, or other necessary details.
Advanced Settings
Tailor the request further with options such as:
- Response Type: Expected format of the response, typically JSON, but other formats can be specified based on the API response.
- Response Encoding: The character encoding of the response, usually utf-8, ensuring correct text interpretation.
- Timeout: The time in milliseconds to wait for a server response. If the response is not received within this time, the request will be aborted.
- Max Redirects: The maximum number of server redirects that will be followed before giving up on the request.
- Max Content Length: The maximum size of the HTTP response content in bytes. If the content exceeds this length, the request will be aborted.
- Max Body Length: The maximum size of the HTTP request body in bytes. This is relevant for methods that send data (like POST or PUT) and ensures large payloads do not exceed set limits.
- Decompress: A boolean indicating whether the response body should be decompressed automatically or if you wish to handle decompression manually.
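As a rough illustration of how some of these options interact, the Python sketch below applies a content-length limit, a response encoding, and a response type to raw response bytes. The class AdvancedSettings, its field names, and its default values are hypothetical; DatErica's actual defaults are not stated here.

```python
import json
from dataclasses import dataclass


@dataclass
class AdvancedSettings:
    # Hypothetical names and defaults, chosen only for this sketch.
    response_type: str = "json"
    response_encoding: str = "utf-8"
    timeout_ms: int = 5000
    max_content_length: int = 1_000_000  # bytes


def decode_response(raw: bytes, settings: AdvancedSettings):
    """Enforce the size limit, then decode and parse the response body."""
    if len(raw) > settings.max_content_length:
        raise ValueError("response exceeds Max Content Length")
    text = raw.decode(settings.response_encoding)
    return json.loads(text) if settings.response_type == "json" else text


settings = AdvancedSettings(timeout_ms=2000)
# Most Python HTTP clients take timeouts in seconds, so convert from milliseconds.
timeout_seconds = settings.timeout_ms / 1000
parsed = decode_response(b'{"ok": true}', settings)
print(timeout_seconds, parsed)
```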
Health Check
Health checks are critical for ensuring the API's reliability and proper functionality. You can define expectations for the response, and if these are not met, an alert will be triggered:
- Status Code Check: Set specific rules to validate the HTTP status code of the response. You can choose from options such as 'Equals', 'Not Equals', 'Less Than', 'Less Than Equals', 'Greater Than', 'Greater Than Equals', and 'Between'. For example, selecting 'Between' with values 200 to 299 ensures that any status code outside this range triggers an alert.
- Response Time Check: Define a maximum response time (in milliseconds). If the API takes longer than this to respond, an alert is triggered. This setting is essential for performance monitoring.
- Headers Check: Assess the API response's headers based on selected conditions. The options available are 'Exists', 'Not Exists', 'Equals', 'Not Equals', 'Includes', and 'Not Includes'. This is particularly useful for confirming the presence of certain security or session headers.
Each of these settings determines what is considered a 'healthy' response from your API. If the actual response falls outside of these configured parameters, DatErica will notify you according to your alert settings.
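The three checks can be read as simple predicates over the response. The Python sketch below shows one plausible interpretation; the function names are hypothetical, and details such as 'Between' being inclusive, header names being matched case-insensitively, and a missing header satisfying 'Not Equals' and 'Not Includes' are assumptions rather than documented behavior.

```python
import operator

# Rule names mirror the options listed above; the mapping itself is illustrative.
STATUS_RULES = {
    "Equals": operator.eq,
    "Not Equals": operator.ne,
    "Less Than": operator.lt,
    "Less Than Equals": operator.le,
    "Greater Than": operator.gt,
    "Greater Than Equals": operator.ge,
}


def check_status(code, rule, value, upper=None):
    """Return True when the status code satisfies the rule."""
    if rule == "Between":
        return value <= code <= upper  # assumption: both bounds inclusive
    return STATUS_RULES[rule](code, value)


def check_response_time(elapsed_ms, max_ms):
    """Return True when the response arrived within the allowed time."""
    return elapsed_ms <= max_ms


def check_header(headers, name, rule, expected=None):
    """Return True when the named header satisfies the rule."""
    lowered = {k.lower(): v for k, v in headers.items()}
    actual = lowered.get(name.lower())
    if rule == "Exists":
        return actual is not None
    if rule == "Not Exists":
        return actual is None
    if rule == "Equals":
        return actual == expected
    if rule == "Not Equals":
        return actual != expected
    if rule == "Includes":
        return actual is not None and expected in actual
    if rule == "Not Includes":
        return actual is None or expected not in actual
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical response values.
headers = {"Content-Type": "application/json; charset=utf-8"}
healthy = (
    check_status(204, "Between", 200, 299)
    and check_response_time(elapsed_ms=120, max_ms=500)
    and check_header(headers, "content-type", "Includes", "json")
)
print(healthy)
```

A response would count as healthy only if every configured check passes; any False result corresponds to the alert condition described above.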
Retry Policy
- Set the number of retry attempts (0-3) and the delay between retries in milliseconds, so that transient failures do not immediately fail the data fetch.
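A retry policy of this shape can be sketched as a small loop. The Python example below is an illustration under assumptions: fetch_with_retries and flaky_send are hypothetical names, any exception is treated as a retryable failure, and the delay is fixed rather than increasing between attempts.

```python
import time


def fetch_with_retries(send, retries=3, delay_ms=500, sleep=time.sleep):
    """Call `send` once, then retry up to `retries` more times on failure."""
    if not 0 <= retries <= 3:
        raise ValueError("retries must be between 0 and 3")
    for attempt in range(retries + 1):
        try:
            return send()
        except Exception:
            if attempt == retries:
                raise  # no attempts left, surface the last error
            sleep(delay_ms / 1000)  # configured delay is in milliseconds


calls = {"count": 0}


def flaky_send():
    """Hypothetical request that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"status": 200}


# The sleep function is stubbed out so the example runs instantly.
result = fetch_with_retries(flaky_send, retries=3, delay_ms=250, sleep=lambda s: None)
print(result, calls["count"])
```

With retries set to 0, the request is attempted exactly once and any failure is raised immediately.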