GFQL Endpoint

The GFQL (graph dataframe query language) endpoint is a Graphistry server HTTP endpoint that securely runs GFQL queries over data already uploaded to your server and returns the results. See 10 mins to GFQL for more information about GFQL and how it can be used.

Access to the GFQL endpoint can also be controlled on a per-user basis using the `flag_gfql_endpoint` Waffle flag set within Graphistry.

Endpoint

POST /api/v2/etl/datasets/<dataset_id>/gfql/<output_type>

This endpoint accepts requests to perform graph query operations and returns graph data.

Authentication

Requests to this endpoint must be authenticated in one of two ways: with a valid JWT passed as a Bearer token in the Authorization header, or with a CSRF token included in the cookies sent with the request.

Authorization: Bearer <your_jwt_token>
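As a rough illustration of the Bearer-token option, the sketch below builds (but does not send) an authenticated POST request with Python's standard library. The helper name `build_gfql_request` and the host `graphistry.example` are hypothetical placeholders, not part of the Graphistry API.

```python
import json
import urllib.request


def build_gfql_request(url: str, jwt_token: str, body: dict) -> urllib.request.Request:
    """Build a POST request carrying the JWT as a Bearer token.

    Hypothetical helper for illustration; nothing is sent until the
    request is passed to urllib.request.urlopen().
    """
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host, dataset id, and token: substitute your own values.
req = build_gfql_request(
    "http://graphistry.example/api/v2/etl/datasets/my_dataset/gfql/",
    "my_jwt_token",
    {"gfql_operations": []},
)
# To actually send it: urllib.request.urlopen(req)
```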

Request Parameters

URL Parameters:

  • dataset_id (required): String containing a unique identifier of the dataset to query.
  • output_type (optional): String which changes output to contain the result node-list or edge-list only. Specify using nodes or edges respectively. If the output format is json, then an array containing only a node-list or edge-list is returned rather than an object containing both. If the output format is of a type returning binary files, only the node file or the edge file is returned.
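The two URL parameters can be combined into a request URL along the following lines. This is a minimal sketch: the function name `gfql_url` is hypothetical, and it assumes that omitting output_type leaves a trailing slash after `gfql`, as in the curl example on this page.

```python
from typing import Optional
from urllib.parse import quote

# The only output_type values named in this document.
VALID_OUTPUT_TYPES = ("nodes", "edges")


def gfql_url(base_url: str, dataset_id: str, output_type: Optional[str] = None) -> str:
    """Return the endpoint URL for a dataset (hypothetical helper).

    output_type=None requests both nodes and edges.
    """
    if output_type is not None and output_type not in VALID_OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {VALID_OUTPUT_TYPES} or None")
    # Percent-encode the dataset id so it is safe as a single path segment.
    path = f"/api/v2/etl/datasets/{quote(dataset_id, safe='')}/gfql/"
    return base_url.rstrip("/") + path + (output_type or "")
```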

Body Parameters (JSON):

  • gfql_operations (Object[], no default): Array of operation objects defining a sequence of graph traversals. Each operation object defines a token in the match pattern used to traverse the graph. See GFQL operations for more information on how to specify a single operation. Operations can be performed on elements matched in the sequence. In addition, a subgraph composed of all traversed elements is extracted from the graph. If left empty, the entire graph is retrieved, which can be used to download datasets.
  • format (string, default json): String which specifies the format used to encode returned graph data. Valid strings are json, csv, and parquet. If both nodes and edges are requested, the data is returned inside a zip file containing a nodes file and an edges file; otherwise the raw text or binary is returned.
  • node_col_subset (string[], no default): Array of strings which specifies the subset of node properties to be returned as results, using the property names as identifiers. This array can also be used to change the order in which the properties are stored in the returned data.
  • edge_col_subset (string[], no default): Array of strings which specifies the subset of edge properties to be returned as results, using the property names as identifiers. This array can also be used to change the order in which the properties are stored in the returned data.
  • df_export_args (Object, no default): Object containing arguments to be passed to the function serializing the result dataframe. The permitted arguments depend on the format requested. For json, valid arguments include engine, orient, date_format, double_precision, force_ascii, date_unit, and index. For csv, valid arguments include sep, na_rep, header, index, and encoding. For parquet, valid arguments include compression and index. See linked documents for more information on arguments passed.
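To show how the body parameters fit together, here is a small sketch that assembles a JSON body from the fields described above. The helper name `gfql_body` is hypothetical, and the column names and export arguments are only illustrative values taken from the examples on this page.

```python
import json
from typing import Any, Dict, List, Optional

VALID_FORMATS = ("json", "csv", "parquet")


def gfql_body(
    operations: List[Dict[str, Any]],
    fmt: str = "json",
    node_cols: Optional[List[str]] = None,
    edge_cols: Optional[List[str]] = None,
    export_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble the request body (hypothetical helper); optional keys are omitted when unset."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {VALID_FORMATS}")
    body: Dict[str, Any] = {"gfql_operations": operations, "format": fmt}
    if node_cols is not None:
        body["node_col_subset"] = node_cols
    if edge_cols is not None:
        body["edge_col_subset"] = edge_cols
    if export_args is not None:
        body["df_export_args"] = export_args
    return body


# Same traversal as the curl examples: start at Alice, follow one forward hop.
ops = [
    {"type": "Node", "filter_dict": {"name": "Alice"}},
    {"type": "Edge", "direction": "forward", "hops": 1},
]
payload = gfql_body(ops, fmt="csv", node_cols=["id", "name"], export_args={"sep": ";"})
print(json.dumps(payload, indent=2))
```

Leaving out unset keys, rather than sending nulls, keeps the request close to the documented defaults; whether the server also accepts explicit nulls is not stated here.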

Response

The response will depend on the format specified in the body parameters and the output type specified in the URL. In all cases the response will contain graph data from the GFQL query result.
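One possible way to handle these response shapes on the client side is sketched below. It rests on several assumptions that this page does not confirm: that json responses are always returned directly (as the JSON example suggests), that the zip archive applies to csv and parquet when no output type is given, and that text is UTF-8 encoded. The function name `decode_gfql_response` is hypothetical, and the archive's member names are read from the archive rather than guessed.

```python
import io
import json
import zipfile
from typing import Any, Optional


def decode_gfql_response(raw: bytes, fmt: str = "json", output_type: Optional[str] = None) -> Any:
    """Decode a response body according to the requested format (hypothetical helper).

    Returns parsed JSON, CSV text, raw parquet bytes, or a dict mapping
    zip member names to their bytes.
    """
    if fmt == "json":
        # Object with nodes and edges, or a bare array for nodes/edges only.
        return json.loads(raw.decode("utf-8"))
    if output_type is None:
        # Assumed: csv/parquet with both nodes and edges arrives as a zip archive.
        with zipfile.ZipFile(io.BytesIO(raw)) as archive:
            return {name: archive.read(name) for name in archive.namelist()}
    if fmt == "csv":
        return raw.decode("utf-8")
    # parquet: hand the bytes to a parquet reader of your choice.
    return raw
```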

Simple GFQL query returning JSON

Request
curl -X POST "http://<host>/api/v2/etl/datasets/<dataset_id>/gfql/" \
 -H "Authorization: Bearer <your_jwt_token>" \
 -H "Content-Type: application/json" \
 -d '{"gfql_operations":[{"type": "Node", "filter_dict":{"name": "Alice"}}, {"type": "Edge", "direction": "forward", "hops": 1}]}'
Response
{
  "nodes": [
    {"id": "1", "name": "Alice", "age": 30}
  ],
  "edges": [
    {"source": "1", "target": "2", "relation": "knows"}
  ]
}

GFQL query returning nodes only using CSV

Request
curl -X POST "http://<host>/api/v2/etl/datasets/<dataset_id>/gfql/nodes" \
 -H "Authorization: Bearer <your_jwt_token>" \
 -H "Content-Type: application/json" \
 -d '{"gfql_operations":[{"type": "Node", "filter_dict":{"name": "Alice"}}, {"type": "Edge", "direction": "forward", "hops": 1}], "format": "csv"}'
Response
id,name,age
1,Alice,30
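A CSV response like the one above can be turned into per-node records with Python's csv module. This is a minimal sketch assuming the default comma separator and a header row (both can be changed through df_export_args); the function name `parse_csv_nodes` is hypothetical.

```python
import csv
import io
from typing import Dict, List


def parse_csv_nodes(text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per row, keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))


# The sample response shown above.
sample = "id,name,age\n1,Alice,30\n"
rows = parse_csv_nodes(sample)
print(rows[0]["name"])  # prints: Alice
```

Note that every value comes back as a string (for example "30" rather than 30), so numeric columns need explicit conversion.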

Notes

  • Ensure the JWT token used for authentication has sufficient permissions to access the dataset.
  • The gfql_operations parameter allows for significant flexibility in query construction; refer to the GFQL specification for detailed usage.