GFQL Endpoint
The GFQL (graph dataframe query language) endpoint is a Graphistry server HTTP endpoint for securely dispatching GFQL queries to run over data already uploaded into your server and getting back the results. See 10 mins to GFQL for more information about GFQL and how it can be used.
On a per-user basis, access to the GFQL endpoint can also be controlled using the `flag_gfql_endpoint` Waffle flag set within Graphistry.
Endpoint
POST /api/v2/etl/datasets/<dataset_id>/gfql/<output_type>
This endpoint accepts requests to perform graph query operations and returns graph data.
Authentication
Requests to this endpoint must be authenticated either with a valid JWT passed as a Bearer token in the Authorization header, or with a CSRF token included in the cookies sent with the request.
Authorization: Bearer <your_jwt_token>
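As an illustration of the Bearer header, here is a minimal Python sketch that prepares an authenticated request using only the standard library. The helper name `build_gfql_request` and the host, dataset id, and token values are placeholders invented for this example; they are not part of any Graphistry client.

```python
import json
import urllib.request


def build_gfql_request(base_url, dataset_id, token, body):
    """Prepare (but do not send) an authenticated POST to the GFQL endpoint.

    Hypothetical helper: only the URL layout and the Bearer header come
    from this page; everything else is illustrative.
    """
    url = f"{base_url.rstrip('/')}/api/v2/etl/datasets/{dataset_id}/gfql/"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; an empty operation list requests the entire graph.
req = build_gfql_request("http://localhost", "my_dataset", "my_jwt", {"gfql_operations": []})
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be read and run without a server.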
Request Parameters
URL Parameters:
- `dataset_id` (required): String containing a unique identifier of the dataset to query.
- `output_type` - `all`, `nodes`, `edges`, `shape` (optional, defaults to `all`): String which changes output contents. Specify `nodes` or `edges` respectively for the corresponding property table. Specify `shape` for the node and edge table row/column counts without returning the actual result tables. Leave blank or specify `all` to return the full graph, meaning both the node and edge tables. If `nodes` or `edges` is requested and the output format is `json`, then an array containing only the node table or edge table is returned rather than an object containing both. If the output format is of a type returning binary files, then `all` triggers returning a zip file of the corresponding format files.
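The URL parameters can be combined into a request path as in the following Python sketch. The function name `gfql_url` is hypothetical, and percent-encoding the dataset id is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote

# Output types listed for this endpoint.
VALID_OUTPUT_TYPES = {"all", "nodes", "edges", "shape"}


def gfql_url(base_url, dataset_id, output_type="all"):
    """Build the endpoint URL for a dataset and output type (illustrative helper)."""
    if output_type not in VALID_OUTPUT_TYPES:
        raise ValueError(f"unsupported output_type: {output_type!r}")
    # Assumption: encode the id so unusual characters cannot alter the path.
    safe_id = quote(dataset_id, safe="")
    return f"{base_url.rstrip('/')}/api/v2/etl/datasets/{safe_id}/gfql/{output_type}"


nodes_url = gfql_url("http://localhost/", "abc123", "nodes")
```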
Body Parameters (JSON):
Parameter | Type | Default | Description |
---|---|---|---|
`gfql_operations` | Object[] | - | Array of operation objects used to define a sequence of graph traversals. Each operation object defines a token in the match pattern used to traverse the graph. See GFQL operations for more information on how to specify a single operation. Operations can be performed on elements matched in the sequence. In addition, a subgraph composed of all elements traversed is extracted from the graph. If left empty, the entire graph is retrieved, which can be used to download datasets. |
`format` | string | `json` | String which specifies the format used to encode returned graph data. Valid strings include `json`, `csv`, or `parquet`. If both nodes and edges are requested, the data is returned inside a zip file containing a nodes and an edges file; otherwise the raw text or binary is returned. |
`node_col_subset` | string[] | - | Array of strings which specifies the subset of node properties to be returned as results, using the property names as identifiers. This array can also be used to change the order in which the properties are stored in the returned data. |
`edge_col_subset` | string[] | - | Array of strings which specifies the subset of edge properties to be returned as results, using the property names as identifiers. This array can also be used to change the order in which the properties are stored in the returned data. |
`df_export_args` | Object | - | Object containing arguments to be passed to the function serializing the result dataframe. The specific arguments permitted vary depending on the format of data requested. If the format requested is `json`, then valid arguments include `engine`, `orient`, `date_format`, `double_precision`, `force_ascii`, `date_unit`, and `index`. If the format requested is `csv`, then valid arguments include `sep`, `na_rep`, `header`, `index`, and `encoding`. If the format requested is `parquet`, then valid arguments include `compression` and `index`. See linked documents for more information on arguments passed. |
`engine` | `cudf`, `pandas` | Infer | Whether to load the graph and query it on the GPU (`cudf`) or CPU (`pandas`). When unspecified, the graph is inspected to determine where to run it, such as by examining file size. |
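The body parameters above can be assembled and sanity-checked on the client before sending. The sketch below is one possible approach in Python: `build_gfql_body` is a hypothetical helper, the allowed `df_export_args` names are copied from the table, and the server remains the authority on what it accepts.

```python
# Export arguments listed per format in the parameter table.
ALLOWED_EXPORT_ARGS = {
    "json": {"engine", "orient", "date_format", "double_precision",
             "force_ascii", "date_unit", "index"},
    "csv": {"sep", "na_rep", "header", "index", "encoding"},
    "parquet": {"compression", "index"},
}


def build_gfql_body(operations, fmt="json", node_cols=None, edge_cols=None,
                    export_args=None, engine=None):
    """Return a request body dict, rejecting values the table does not list."""
    if fmt not in ALLOWED_EXPORT_ARGS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if engine not in (None, "cudf", "pandas"):
        raise ValueError(f"unsupported engine: {engine!r}")
    unknown = set(export_args or {}) - ALLOWED_EXPORT_ARGS[fmt]
    if unknown:
        raise ValueError(f"arguments not listed for {fmt}: {sorted(unknown)}")

    body = {"gfql_operations": list(operations), "format": fmt}
    # Optional keys are omitted so the server defaults apply.
    if node_cols is not None:
        body["node_col_subset"] = list(node_cols)
    if edge_cols is not None:
        body["edge_col_subset"] = list(edge_cols)
    if export_args:
        body["df_export_args"] = dict(export_args)
    if engine is not None:
        body["engine"] = engine
    return body


# Empty operations retrieve the whole graph; here as CSV with two node columns.
body = build_gfql_body([], fmt="csv", node_cols=["id", "name"], export_args={"sep": ";"})
```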
Response
The response will depend on the format specified in the body parameters and the output type specified in the URL. In all cases the response will contain graph data from the GFQL query result.
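Because the payload varies with both settings, a client typically dispatches on them when decoding. The following Python sketch shows one way to do that with the standard library. `decode_gfql_response` is a hypothetical helper, the file names inside the zip archive are not specified on this page, so the sketch returns whatever members it finds, and treating `shape` with non-JSON formats like a single table is an assumption.

```python
import csv
import io
import json
import zipfile


def decode_gfql_response(payload, fmt="json", output_type="all"):
    """Decode raw response bytes according to the requested format and output type."""
    if fmt == "json":
        # Object with nodes and edges for "all"; an array otherwise.
        return json.loads(payload.decode("utf-8"))
    if output_type == "all":
        # csv/parquet with both tables: a zip archive. Member names are not
        # documented here, so return a mapping of name -> raw bytes.
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            return {name: archive.read(name) for name in archive.namelist()}
    if fmt == "csv":
        # Single table as text: one dict per row, all values as strings.
        # Assumption: "shape" with csv is handled like any single table.
        return list(csv.DictReader(io.StringIO(payload.decode("utf-8"))))
    # Single parquet table: leave the bytes for a parquet reader of your choice.
    return payload


rows = decode_gfql_response(b"id,name,age\n1,Alice,30\n", fmt="csv", output_type="nodes")
```

Parquet bytes are returned untouched on purpose: decoding them needs a third-party reader, which this sketch avoids depending on.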
Simple GFQL query returning JSON
Request
curl -X POST "http://<host>/api/v2/etl/datasets/<dataset_id>/gfql/" \
-H "Authorization: Bearer <your_jwt_token>" \
-H "Content-Type: application/json" \
-d '{"gfql_operations":[{"type": "Node", "filter_dict":{"name": "Alice"}}, {"type": "Edge", "direction": "forward", "hops": 1}]}'
Response
{
"nodes": [
{"id": "1", "name": "Alice", "age": 30}
],
"edges": [
{"source": "1", "target": "2", "relation": "knows"}
]
}
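The request body used in this example can also be assembled programmatically instead of being written by hand. This Python sketch uses two hypothetical helpers, `node_op` and `edge_op`, which emit only the fields that appear in the curl example; other fields are left to the GFQL operations documentation.

```python
import json


def node_op(filter_dict=None):
    """Build a node-matching operation, optionally filtered by property values."""
    op = {"type": "Node"}
    if filter_dict:
        op["filter_dict"] = dict(filter_dict)
    return op


def edge_op(direction="forward", hops=1):
    """Build an edge-traversal operation with a direction and hop count."""
    return {"type": "Edge", "direction": direction, "hops": hops}


# Same query as the curl example: start at Alice, follow one forward hop.
body = {"gfql_operations": [node_op({"name": "Alice"}), edge_op("forward", 1)]}
payload = json.dumps(body)
```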
GFQL query returning nodes only using CSV
Request
curl -X POST "http://<host>/api/v2/etl/datasets/<dataset_id>/gfql/nodes" \
-H "Authorization: Bearer <your_jwt_token>" \
-H "Content-Type: application/json" \
-d '{"gfql_operations":[{"type": "Node", "filter_dict":{"name": "Alice"}}, {"type": "Edge", "direction": "forward", "hops": 1}], "format": "csv"}'
Response
id,name,age
1,Alice,30
GFQL query returning edges as a parquet
Request
curl -X POST "http://<host>/api/v2/etl/datasets/<dataset_id>/gfql/edges" \
-H "Authorization: Bearer <your_jwt_token>" \
-H "Content-Type: application/json" \
-d '{"gfql_operations":[{"type": "Node", "filter_dict":{"name": "Alice"}}, {"type": "Edge", "direction": "forward", "hops": 1}], "format": "parquet"}'
Response
Binary parquet file
GFQL query returning shape of results without contents
Request
curl -X POST "http://<host>/api/v2/etl/datasets/<dataset_id>/gfql/shape" \
-H "Authorization: Bearer <your_jwt_token>" \
-H "Content-Type: application/json" \
-d '{"gfql_operations":[{"type": "Node", "filter_dict":{"name": "Alice"}}, {"type": "Edge", "direction": "forward", "hops": 1}], "format": "json"}'
Response
[ { "kind": "nodes", "rows": 1, "columns": 3 }, { "kind": "edges", "rows": 1, "columns": 2 } ]
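One use for the shape output is checking result size before downloading the full tables. The Python sketch below parses the response shown above. The helper names and the row-budget check are illustrative choices, not part of the endpoint.

```python
import json

# The shape response from the example above, verbatim.
shape_text = ('[ { "kind": "nodes", "rows": 1, "columns": 3 }, '
              '{ "kind": "edges", "rows": 1, "columns": 2 } ]')


def summarize_shape(text):
    """Map each table kind to a (rows, columns) tuple."""
    return {entry["kind"]: (entry["rows"], entry["columns"]) for entry in json.loads(text)}


def within_row_budget(shape, max_rows):
    """Return True when every table has at most max_rows rows."""
    return all(rows <= max_rows for rows, _ in shape.values())


shape = summarize_shape(shape_text)
ok_to_download = within_row_budget(shape, max_rows=100_000)
```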
Notes
- Ensure the JWT token used for authentication has sufficient permissions to access the dataset.
- The `gfql_operations` parameter allows for significant flexibility in query construction; refer to the GFQL specification for detailed usage.