This tutorial creates a cURL bash script that works directly with the Graphistry REST API to generate a visualization from CSV files:

  • Swap in other supported file formats like Arrow, JSON, ORC, Parquet, and XLS by changing the parts that say CSV to your desired format
  • Send compressed data by ensuring the compression type is in the filename (ex: data.csv.gz) and specifying parameter file_compression (ex: "gzip")
  • Use your language instead of bash by swapping in its HTTP library, such as JavaScript's Axios library or Python's requests library
  • ... Or see a language-native library if available, which wraps this functionality
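
As a rough sketch of the compressed-data bullet above, the snippet below gzips a small CSV and prepares matching metadata. The file_type and file_compression fields are the ones used by the file-creation call later in this tutorial; the scratch directory, the file name sample_edges.csv, and the name "my compressed edges" are illustrative assumptions.

```shell
# Work in a scratch directory so the sketch does not touch real data.
WORK_DIR=$(mktemp -d)
CSV_PATH="${WORK_DIR}/sample_edges.csv"

# Illustrative data: the same three edges used later in this tutorial.
printf 's,d,txt,num\na,b,the,2\nb,c,quick,4\nc,a,brown,6\n' > "$CSV_PATH"

# Keep the compression type in the filename: sample_edges.csv.gz
gzip -c "$CSV_PATH" > "${CSV_PATH}.gz"

# Metadata for the file-creation call, with file_compression set to match.
FILE_METADATA='{"file_type": "csv", "name": "my compressed edges", "file_compression": "gzip"}'

printf 'Compressed file: %s\n' "${CSV_PATH}.gz"
printf 'Metadata: %s\n' "$FILE_METADATA"
```

The compressed file would then be uploaded in place of the plain CSV, with FILE_METADATA sent as the body of the file-creation call.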

You can see the combined script in curlTutorial.sh.

Concept

After we set up our project, the upload script will perform the following API calls:
  1. Get a short-lived JWT token for a REST API session
  2. Create and populate a File for your edges data
  3. Create and populate a File for your nodes data (optional)
  4. Create a visualization Dataset that uses your uploaded Files and a custom JSON configuration

1. Preparation: JQ, data, and credentials

Install JQ for easy command-line manipulation of JSON responses from the REST API. This step is optional, but recommended for command-line use.
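
Because JQ is optional, a script may want to check for it before relying on it. A minimal sketch, where have_cmd is a hypothetical helper name introduced here:

```shell
# Return success only if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd jq; then
    echo "jq found: $(jq --version)"
else
    echo "jq not found; JSON responses will be printed unformatted" >&2
fi
```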

Create an empty file uploader.sh, give it execute permissions (chmod +x uploader.sh), and fill in your credentials:

#!/bin/bash
set -e  # terminate on fail

# Replace testuser, testpwd, https://hub.graphistry.com as appropriate
# For a free public API account, visit https://hub.graphistry.com and use an email signup
GRAPHISTRY_USERNAME=${GRAPHISTRY_USERNAME:-testuser}
GRAPHISTRY_PASSWORD=${GRAPHISTRY_PASSWORD:-testpwd}
GRAPHISTRY_BASE_PATH=${GRAPHISTRY_BASE_PATH:-https://hub.graphistry.com}
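
The ${VAR:-default} form keeps a value already set in the environment and falls back to the placeholder otherwise. A small guard can warn before placeholder credentials are sent. This is only a sketch; credentials_look_real is a hypothetical helper introduced for illustration.

```shell
# Hypothetical helper: succeeds only when both values are non-empty
# and differ from the tutorial placeholders.
# $1 = username, $2 = password
credentials_look_real() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "testuser" ] && [ "$2" != "testpwd" ]
}

GRAPHISTRY_USERNAME=${GRAPHISTRY_USERNAME:-testuser}
GRAPHISTRY_PASSWORD=${GRAPHISTRY_PASSWORD:-testpwd}

if ! credentials_look_real "$GRAPHISTRY_USERNAME" "$GRAPHISTRY_PASSWORD"; then
    echo "Warning: still using placeholder credentials" >&2
fi
```

With this pattern, real values can be supplied per run without editing the script, for example by exporting GRAPHISTRY_USERNAME and GRAPHISTRY_PASSWORD before calling ./uploader.sh.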

Download a sample graphistry_edges.csv and graphistry_nodes.csv. Alternatively, the reference curlTutorial.sh generates these files for you:

#graphistry_edges.csv
s,d,txt,num
a,b,the,2
b,c,quick,4
c,a,brown,6

#graphistry_nodes.csv
n,v,v2
a,2,a
b,4,aa
c,6,aaa
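
If you prefer to generate the sample files rather than download them, here-documents are one way to do it. This is a sketch of the idea and not necessarily how curlTutorial.sh does it; it writes into the current directory and overwrites existing files of the same name.

```shell
# Write the sample edge list: source, destination, and two attribute columns.
cat > graphistry_edges.csv <<'EOF'
s,d,txt,num
a,b,the,2
b,c,quick,4
c,a,brown,6
EOF

# Write the sample node table: node ID plus two attribute columns.
cat > graphistry_nodes.csv <<'EOF'
n,v,v2
a,2,a
b,4,aa
c,6,aaa
EOF

# Show line counts as a quick sanity check (header plus three rows each).
wc -l graphistry_edges.csv graphistry_nodes.csv
```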

Your directory should now look like:

ls
uploader.sh  graphistry_edges.csv  graphistry_nodes.csv

2. Get a short-lived JWT token for our REST API session

Graphistry supports JWT tokens for secure API use. The script will use an account's credentials to get a JWT session token. Subsequent API calls will include the JWT token for automatic authentication and authorization.

Anyone who holds a JWT token is authorized to act as the corresponding user account, so carefully consider where tokens get transmitted in your application.

  • Server app: A server app might use a single Graphistry "service" account and thus would want to avoid exposing its JWT token to its many browser-based users. Instead, it would make calls on behalf of the user, and send the generated embeddable visualization URLs to them.
  • Client app: A thick client app might not even have a server. In this case, the user might enter their username/password, which a browser-side JavaScript app can send directly to the Graphistry server to get a JWT token.

The API calls are the same in both cases, so this tutorial applies to both cases.


echo "1. Generate JWT token -- POST /api-token-auth/"
OUT=$( curl -fsv -X POST \
    -H "Content-Type: application/json" \
    -d "{\"username\": \"${GRAPHISTRY_USERNAME}\", \"password\": \"${GRAPHISTRY_PASSWORD}\"}" \
    ${GRAPHISTRY_BASE_PATH}/api-token-auth/ )
echo $OUT | jq .
export GRAPHISTRY_TOKEN=$( echo "$OUT" | jq -jr .token )
Result:
{
    "pk": 1613158536,
    "user": {
        "name": "user",
        "username": "user",
        "email": "user@user.ngo",
        "id": 1
    },
    "token": "MY_TOKEN123"
}
We used JQ to extract and store variable GRAPHISTRY_TOKEN=MY_TOKEN123.
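
Two spots in the call above can be fragile: a username or password containing a double quote or backslash produces invalid JSON, and jq -r .token prints the string null when the response has no token field. A minimal sketch of guards for both, using hypothetical helpers json_escape and require_value; the escaping covers only quotes and backslashes, not control characters.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Succeed only if the extracted value is neither empty nor the literal "null".
require_value() {
    [ -n "$1" ] && [ "$1" != "null" ]
}

# Build a login payload from escaped values (illustrative credentials).
ESCAPED_USER=$(json_escape 'test"user')
ESCAPED_PASS=$(json_escape 'pa\ss')
EXAMPLE_PAYLOAD="{\"username\": \"${ESCAPED_USER}\", \"password\": \"${ESCAPED_PASS}\"}"
printf '%s\n' "$EXAMPLE_PAYLOAD"
```

In the tutorial script, a check such as require_value "$GRAPHISTRY_TOKEN" || exit 1 could follow the token extraction so later calls never run with a missing token.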

3. Create and populate an edge File

First, we create a File object for our graphistry_edges.csv with the intended parsing hints, via POST /api/v2/files/. For bigger files, we recommend typed formats such as ORC, Parquet, or Arrow. You can also use an untyped format like CSV and provide type hints called dtypes. Graphistry is part of the NVIDIA RAPIDS.ai Python OSS ecosystem, so the APIs and documentation are written to line up.

Second, we send data to populate the file via POST /api/v2/upload/files/MY_FILE_ID?erase=true. Upon success, this returns a file_id. Optional parameter erase controls whether the data gets deleted upon failure to parse. A data cleaning UI may want to set erase=false if it provides UI controls for letting the user experiment with changing the file parsing settings (PATCH /api/v2/upload/files/MY_FILE_ID/) without having to reupload the file.
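
The upload endpoint and its erase flag can be wrapped in a small helper so the URL is always built the same way. This is a sketch: upload_url is a hypothetical name, the path follows the one given above, and the helper (not necessarily the API) defaults erase to true.

```shell
# Build the upload URL for a File ID.
# $1 = base path, $2 = file ID, $3 = erase flag (optional: true or false)
upload_url() {
    printf '%s/api/v2/upload/files/%s?erase=%s' "$1" "$2" "${3:-true}"
}

# Quote the result when passing it to curl so the shell never treats "?" as a glob.
printf '%s\n' "$(upload_url https://hub.graphistry.com MY_FILE_ID false)"
```

A data cleaning UI that wants to keep the raw bytes after a parse failure would pass false as the third argument.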

OUT=$( curl -fsv -X POST \
    -H "Authorization: Bearer ${GRAPHISTRY_TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{"file_type": "csv", "name": "my edges", "file_compression": "" }' \
    ${GRAPHISTRY_BASE_PATH}/api/v2/files/ )
echo $OUT | jq .
export EDGES_FILE_ID=$( echo "$OUT" | jq -jr .file_id )

OUT=$( curl -fsv -X POST \
    -H "Authorization: Bearer ${GRAPHISTRY_TOKEN}" \
    -H "Content-Type: application/json" \
    -T graphistry_edges.csv \
    ${GRAPHISTRY_BASE_PATH}/api/v2/upload/files/${EDGES_FILE_ID}?erase=true )
echo $OUT | jq .
Result:
{
    "created_at": "2021-02-12T19:35:36.974870Z",
    "updated_at": "2021-02-12T19:35:36.974911Z",
    "agent_name": "",
    "agent_version": "",
    "author": 123,
    "bytes_count": null,
    "description": "",
    "file_id": "MYFILE_456",
    "name": "my edges",
    "file_compression": "",
    "file_type": "csv",
    "parser_options": {},
    "tables_schemas": {},
    "sql_transforms": null,
    "is_uploaded": false,
    "is_valid": false,
    "errors": []
}

{
    "agent_name": "",
    "agent_version": "",
    "author": 123,
    "bytes_count": 4,
    "created_at": "2021-02-12T19:35:36.974870Z",
    "description": "",
    "errors": {},
    "file_compression": "",
    "file_id": "MYFILE_456",
    "file_type": "csv",
    "is_uploaded": true,
    "is_valid": true,
    "name": "my edges",
    "parser_options": {},
    "sql_transforms": null,
    "tables_schemas": [
      {
        "dtypes": {
          "d": "object",
          "num": "int64",
          "s": "object",
          "txt": "object"
        },
        "name": "Untitled 0",
        "num_cols": 4,
        "num_rows": 3
      }
    ],
    "updated_at": "2021-02-12T19:35:36.974911Z"
}

We used JQ to extract and store variable EDGES_FILE_ID=MYFILE_456.
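
The second response carries is_uploaded and is_valid, so a script can stop early when parsing fails instead of referencing a bad File in a Dataset. A sketch with a hypothetical helper check_upload_ok; it uses JQ when available and otherwise falls back to a rough grep match, which is fragile for anything beyond responses shaped like the ones shown.

```shell
# Succeed only if the upload response reports is_uploaded and is_valid as true.
# $1 = JSON response body from the upload call
check_upload_ok() {
    if command -v jq >/dev/null 2>&1; then
        [ "$(printf '%s' "$1" | jq -r '.is_uploaded and .is_valid')" = "true" ]
    else
        # Fallback without JQ: text match on the two flags.
        printf '%s' "$1" | grep -Eq '"is_uploaded":[[:space:]]*true' &&
        printf '%s' "$1" | grep -Eq '"is_valid":[[:space:]]*true'
    fi
}

# Illustrative response, trimmed to the fields the check reads.
SAMPLE_RESPONSE='{"file_id": "MYFILE_456", "is_uploaded": true, "is_valid": true}'
if check_upload_ok "$SAMPLE_RESPONSE"; then
    echo "upload OK"
else
    echo "upload failed" >&2
fi
```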

4. Create and populate a node File

We repeat step 3, except now for graphistry_nodes.csv.

If you do not have a nodes file, you can skip this step. Graphistry will automatically create nodes for you.

OUT=$( curl -fsv -X POST \
    -H "Authorization: Bearer ${GRAPHISTRY_TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{"file_type": "csv", "name": "my nodes", "file_compression": "" }' \
    ${GRAPHISTRY_BASE_PATH}/api/v2/files/ )
echo $OUT | jq .
export NODES_FILE_ID=$( echo "$OUT" | jq -jr .file_id )

OUT=$( curl -fsv -X POST \
    -H "Authorization: Bearer ${GRAPHISTRY_TOKEN}" \
    -H "Content-Type: application/json" \
    -T graphistry_nodes.csv \
    ${GRAPHISTRY_BASE_PATH}/api/v2/upload/files/${NODES_FILE_ID}?erase=true )
echo $OUT | jq .

The result is similar to before:

{
    "created_at": "2021-02-12T19:35:37.652643Z",
    "updated_at": "2021-02-12T19:35:37.652678Z",
    "agent_name": "",
    "agent_version": "",
    "author": 123,
    "bytes_count": null,
    "description": "",
    "file_id": "MYFILE_789",
    "name": "my nodes",
    "file_compression": "",
    "file_type": "csv",
    "parser_options": {},
    "tables_schemas": {},
    "sql_transforms": null,
    "is_uploaded": false,
    "is_valid": false,
    "errors": []
}

{
    "agent_name": "",
    "agent_version": "",
    "author": 123,
    "bytes_count": 4,
    "created_at": "2021-02-12T19:35:37.652643Z",
    "description": "",
    "errors": {},
    "file_compression": "",
    "file_id": "MYFILE_789",
    "file_type": "csv",
    "is_uploaded": true,
    "is_valid": true,
    "name": "my nodes",
    "parser_options": {},
    "sql_transforms": null,
    "tables_schemas": [
      {
        "dtypes": {
          "n": "object",
          "v": "int64",
          "v2": "object"
        },
        "name": "Untitled 0",
        "num_cols": 3,
        "num_rows": 3
      }
    ],
    "updated_at": "2021-02-12T19:35:37.652678Z"
}

We used JQ to extract and store variable NODES_FILE_ID=MYFILE_789.

5. Create a visualization Dataset

Finally, we tell Graphistry how to combine our files into a graph.

Recall the structure of our dataset. The graphistry_edges.csv has ID columns s and d, while graphistry_nodes.csv has ID column n. We create a JSON snippet to specify those bindings, and use the file_id values to refer to the files:

DATASET_BINDINGS="{\
    \"node_encodings\": {\"bindings\": {\"node\": \"n\"}},\
    \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\
    \"node_files\": [\"${NODES_FILE_ID}\"],\
    \"edge_files\": [\"${EDGES_FILE_ID}\"],\
    \"metadata\": {},\
    \"name\": \"my csv viz\"\
}"
OUT=$( curl -fsv -X POST \
	-H "Authorization: Bearer $GRAPHISTRY_TOKEN" \
	-H "Content-Type: application/json" \
	-d"$DATASET_BINDINGS" \
	${GRAPHISTRY_BASE_PATH}/api/v2/upload/datasets/ )
echo $OUT | jq .
export GRAPHISTRY_DATASET_ID=$( echo "$OUT" | jq -jr .data.dataset_id )
echo
echo "URL: ${GRAPHISTRY_BASE_PATH}/graph/graph.html?dataset=${GRAPHISTRY_DATASET_ID}"

Upon successful execution, we should see an embeddable URL such as https://hub.graphistry.com/graph/graph.html?dataset=abc123.
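
The backslash-escaped bindings string works, but a here-document can be easier to read and edit. A sketch with the same bindings; dataset_bindings is a hypothetical helper, and the IDs passed in are the placeholder values from the sample responses.

```shell
# Emit the Dataset JSON for the given edge and node File IDs.
# $1 = edges file ID, $2 = nodes file ID, $3 = dataset name
dataset_bindings() {
    cat <<EOF
{
    "node_encodings": {"bindings": {"node": "n"}},
    "edge_encodings": {"bindings": {"source": "s", "destination": "d"}},
    "node_files": ["$2"],
    "edge_files": ["$1"],
    "metadata": {},
    "name": "$3"
}
EOF
}

DATASET_BINDINGS=$(dataset_bindings MYFILE_456 MYFILE_789 "my csv viz")
printf '%s\n' "$DATASET_BINDINGS"
```

The resulting DATASET_BINDINGS value can be passed to curl with -d "$DATASET_BINDINGS" exactly as in the call above.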

Further reading and resources

Tutorial materials

API calls

  • File - Create, populate, patch, and delete
  • Dataset - Create and delete
  • URL embedding API - Style an existing visualization via URL parameters
  • JavaScript client - Dynamically embed and control a visualization using clientside JavaScript