REST API Documentation: Python Endpoint

Python Endpoint

The Python endpoint is a Graphistry server endpoint that runs user-supplied Python code directly on the server. This lets a user take advantage of the computational resources (particularly GPU resources) available on the Graphistry server through cuDF, pygraphistry, and other libraries. It can also be used to perform reduce operations on large datasets without incurring the cost of transferring all the data back to the client.
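
As a rough illustration of such a reduce operation, the sketch below defines a `task` function (the entry point the endpoint expects, described under Body Parameters) that returns only row counts instead of the full tables. It assumes the plottable exposes its tables as `g._nodes` and `g._edges`, as in the example request later on this page. The `fake_g` object is a hypothetical local stand-in used only for a dry run; it is not part of Graphistry.

```python
from types import SimpleNamespace


def task(g):
    """Return row counts instead of the full node and edge tables."""
    return {
        "n_nodes": 0 if g._nodes is None else len(g._nodes),
        "n_edges": 0 if g._edges is None else len(g._edges),
    }


# Hypothetical stand-in for the server-side plottable, for a local dry run only.
# On the server, g._nodes and g._edges would be dataframes, which also support len().
fake_g = SimpleNamespace(
    _nodes=[{"id": "1"}, {"id": "2"}],
    _edges=[{"source": "1", "target": "2"}],
)

summary = task(fake_g)
print(summary)  # {'n_nodes': 2, 'n_edges': 1}
```

Only the small summary dictionary would travel back to the client, regardless of how large the dataset is.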

This feature is currently only available to enterprise users. Access to the Python endpoint is controlled by two feature flags set within Graphistry: flag_python_endpoint_disabled, which must be set to False, and flag_python_endpoint_enabled_users, which must be set to True. Both conditions must be met for a given user to be granted access. To disable the Python endpoint completely, overriding both feature flags, an administrator can set an environment variable. See Configuring Graphistry for more information on this option.

Endpoint

POST /api/v2/datasets/<dataset_id>/python

This endpoint accepts requests to execute Python code, returning data in either string or JSON form.

Authentication

Requests to this endpoint must be authenticated, either with a valid JWT passed as a Bearer token in the Authorization header or with a CSRF token included in the cookies sent with the request.

Authorization: Bearer <your_jwt_token>

Request Parameters

URL Parameters:

dataset_id (required): String containing a unique identifier of the dataset to pass into the executed Python function.

Body Parameters (JSON):

Parameter	Type	Default	Description
execute	`string`	-	String containing valid Python code to execute server-side. The code must define a function named `task`, which the endpoint calls with a single argument: the Graphistry plottable object containing the dataset specified by `dataset_id`. The function should return either a Python `str` or a JSON-serializable `dict`, which is returned to the user. There are currently no restrictions on the code that can be run through the Python endpoint. Import statements can bring in any standard Python library as well as a handful of others, including but not limited to `graphistry`, `cudf`, `numpy`, and `pandas`. Warning: the endpoint currently has no safeguards such as timeouts or multithreading, so a bad `execute` request may require the container to be restarted. Any uncaught error in the executed code is returned as a JSON error message.
engine	`cudf`, `pandas`	`cudf`	Whether to load and query the graph on the GPU (`cudf`) or on the CPU (`pandas`).
output_type	`all`, `json`, `edges`, `nodes`, `table`, `shape`	`json`	What to return, using the data representation specified in the accompanying `format` parameter. If the Python code returns a `Plottable`, return all or part of the graph: `"all"`: a zip file with files for the nodes and edges; `"json"`: a JSON object with the nodes and edges; `"edges"`: the edges table; `"nodes"`: the nodes table; `"shape"`: the row and column counts of the node and edge tables. When the Python code is expected to return a dataframe-like object (e.g., `pandas.DataFrame`, `cudf.DataFrame`, or `arrow.Table`): `"table"`: the table; `"shape"`: the row and column counts of the table. For arbitrary return values, use `"json"`.
format	`json`, `csv`, `parquet`	`json`	Serialization format for the returned data.
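
The body parameters above can be assembled on the client before sending. The following is a minimal sketch, not part of any Graphistry client library: `build_body` and the three value sets are hypothetical helpers that mirror the table, and the `def task` substring check is only a crude client-side heuristic, not how the server validates requests.

```python
import json

# Allowed values, copied from the parameter table above.
ENGINES = {"cudf", "pandas"}
OUTPUT_TYPES = {"all", "json", "edges", "nodes", "table", "shape"}
FORMATS = {"json", "csv", "parquet"}


def build_body(execute, engine="cudf", output_type="json", fmt="json"):
    """Validate the parameters and serialize them as a JSON request body."""
    if "def task" not in execute:
        raise ValueError("execute must define a function named task")
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"unknown output_type: {output_type!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return json.dumps(
        {
            "execute": execute,
            "engine": engine,
            "output_type": output_type,
            "format": fmt,
        }
    )


body = build_body("def task(g):\n    return 'ok'", engine="pandas")
```

Serializing with `json.dumps` takes care of escaping quotes and newlines in the code string, which is otherwise easy to get wrong when writing the body by hand.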

Response

The response will depend on the type of the data returned from the task function provided, and can either be JSON or string.
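
Because the body may be either JSON or a plain string, a client has to handle both cases. The sketch below shows one possible approach under the assumption that the body is UTF-8 text: try to parse it as JSON and fall back to the raw string. `decode_response` is a hypothetical helper, and it does not cover binary outputs such as `parquet`.

```python
import json


def decode_response(raw_body: bytes):
    """Return parsed JSON when the body is valid JSON, else the decoded text."""
    text = raw_body.decode("utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Not JSON, so treat the body as the plain string returned by task.
        return text
```

One caveat of this approach: a returned string that happens to be valid JSON (for example `123`) is parsed rather than kept as text.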

Simple Python query returning the full serialized dataset

Request

curl -X POST "http://<host>/api/v2/datasets/<dataset_id>/python" \
 -H "Authorization: Bearer <your_jwt_token>" \
 -H "Content-Type: application/json" \
 -d "
{
    \"execute\": \" \
        def task(g): \
            return { \
                'nodes': g._nodes.to_dict(orient='records') if g._nodes is not None else [], \
                'edges': g._edges.to_dict(orient='records') if g._edges is not None else [] \
            } \
    \"
}
"

Response

{
  "nodes": [
    {"id": "1", "name": "Alice", "age": 30}
  ],
  "edges": [
    {"source": "1", "target": "2", "relation": "knows"}
  ]
}
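
The curl request above can also be prepared from Python. The sketch below uses only the standard library and builds the request without sending it. `build_python_request` is a hypothetical helper, and the host, dataset ID, and token values are placeholders rather than real identifiers.

```python
import json
import urllib.request

# Same task as in the curl example, written as ordinary multi-line Python.
SOURCE = """
def task(g):
    return {
        'nodes': g._nodes.to_dict(orient='records') if g._nodes is not None else [],
        'edges': g._edges.to_dict(orient='records') if g._edges is not None else [],
    }
"""


def build_python_request(host, dataset_id, jwt_token, source):
    """Build (but do not send) a POST request to the Python endpoint."""
    url = f"http://{host}/api/v2/datasets/{dataset_id}/python"
    body = json.dumps({"execute": source}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute a real host, dataset ID, and JWT.
req = build_python_request("graphistry.example.com", "my_dataset_id", "my_jwt_token", SOURCE)

# To actually send it (not executed here):
#     with urllib.request.urlopen(req) as resp:
#         print(resp.read().decode("utf-8"))
```

Letting `json.dumps` encode the code string avoids the manual quote and line-continuation escaping that the shell version requires.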