Upload Data (2.0 API)
Concepts
The lowest-level 2.0 upload REST API involves the following steps:
- Generate a short-lived JWT session token using your account credentials
- To upload into an organization, specify the field org_name for Files and Datasets
- Optionally (recommended) upload your data as Files. The fastest, most scalable, and most reliable formats are ORC, Parquet, and Arrow.
- Create a visualization Dataset with any optional configuration (JSON) describing how to turn your data into a graph. If you did not upload data as Files in the previous step (the recommended route), you can upload it here.
Notes:
- Graphistry's built-in notebook server: If you are using the REST API through the provided Jupyter Notebook, you will likely want to use the Docker internal network base path of http://nginx.
- Trailing slashes: Take note of the use of trailing slashes ("/").
For examples, the documentation includes example curl calls. For Python examples, see PyGraphistry's ArrowUploader class.
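As a rough sketch of the notes above, the following Python helpers join a base path with a route and build the headers that the routes on this page list. The helper names api_url and auth_headers are hypothetical (they are not part of PyGraphistry), the token value is a placeholder, and obtaining the JWT itself is not shown.

```python
def api_url(base_path: str, route: str) -> str:
    """Join a base path and a route such as 'api/v2/datasets/'.

    The end of the route is left exactly as given, so a trailing slash
    is neither added nor removed.
    """
    return base_path.rstrip("/") + "/" + route.lstrip("/")


def auth_headers(jwt_token: str) -> dict:
    """Headers listed for the JSON routes on this page."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {jwt_token}",
    }


# Inside Graphistry's built-in notebook server, the base path is the
# Docker internal network address mentioned in the notes.
url = api_url("http://nginx", "api/v2/datasets/")
headers = auth_headers("YOUR_JWT_TOKEN")
```

Keeping the route string untouched makes it easy to copy routes from the tables without accidentally dropping a trailing slash.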
List Datasets
Route | Method | Headers | Parameters | Return |
---|---|---|---|---|
api/v2/datasets/ | GET | Content-Type: application/json<br>Authorization: Bearer YOUR_JWT_TOKEN | { ?"limit": int, ?"offset": int } | { "count": int, "next": str \| null, "previous": str \| null, "results": [{ "dataset_id": str, "name": str, "slug": str \| null, "author": int, "node_count": int \| null, "edge_count": int \| null, "agent_name": str, "agent_version": str, "uri_tag": str, "description": str, "image": str, "created_at": str, "updated_at": str, "node_files": [str \| [ str, int ]], "edge_files": [str \| [ str, int ]], "node_transform": str, "edge_transform": str, "edge_hypergraph_transform": [{}], "compute": json, "layout": json, "node_encodings": { "bindings": json }, "edge_encodings": { "bindings": json }, "metadata": json, "timestamp": str, "legacy_dataset_name": str \| null, "legacy_dataset_author": str \| null }] } |
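A minimal sketch of calling the listing route from Python, using only the standard library. It assumes that limit and offset are sent as query-string parameters and that the "next" field holds the URL of the following page; neither is stated explicitly in the table. The function names are hypothetical, and the request is built but not sent.

```python
import urllib.parse
import urllib.request


def list_datasets_request(base_path, jwt_token, limit=None, offset=None):
    """Build (but do not send) a GET request for api/v2/datasets/."""
    # Assumption: the optional limit/offset travel in the query string.
    params = {k: v for k, v in (("limit", limit), ("offset", offset))
              if v is not None}
    url = base_path.rstrip("/") + "/api/v2/datasets/"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )


def iter_datasets(first_page, fetch_page):
    """Yield every dataset, following "next" until it is null.

    fetch_page is a caller-supplied function that maps a URL to a
    decoded JSON page with the shape shown in the Return column.
    """
    page = first_page
    while True:
        yield from page["results"]
        if page["next"] is None:
            break
        page = fetch_page(page["next"])


req = list_datasets_request("http://nginx", "YOUR_JWT_TOKEN", limit=10, offset=20)
# To send it: urllib.request.urlopen(req), then decode the JSON body.
```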
Create dataset
There are several sets of options.
Logo style follows the CSS filter specification and CSS gradients. Blend mode values follow the CSS mix-blend-mode specification. For setting additional style properties, see the URL API and how to set encodings.
The node_files and edge_files refer to the string file_id returned by /file/ API calls and the index of the table(s) in them. You can use them in two ways:
- First/only table in the file: edge_files: [ "file_id123" ]
- Arbitrary table #22 in the file: edge_files: [ ["file_id123", 22] ]
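The two forms above can be produced and checked with a couple of small helpers. This is only an illustrative sketch: file_ref and is_valid_file_ref are hypothetical names, and the validation simply mirrors the str | [ str, int ] shape used in the tables on this page.

```python
def file_ref(file_id, table_index=None):
    """Return one node_files/edge_files entry.

    With no index, the entry is the bare file_id (first/only table);
    otherwise it is the [file_id, index] pair.
    """
    if table_index is None:
        return file_id
    return [file_id, table_index]


def is_valid_file_ref(entry):
    """Check that an entry matches the shape str | [str, int]."""
    if isinstance(entry, str):
        return True
    return (
        isinstance(entry, list)
        and len(entry) == 2
        and isinstance(entry[0], str)
        and isinstance(entry[1], int)
        and not isinstance(entry[1], bool)  # bool is a subclass of int
    )


edge_files = [file_ref("file_id123"), file_ref("file_id123", 22)]
```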
Route | Method | Headers | Parameters | Return |
---|---|---|---|---|
api/v2/upload/datasets/ | POST | Content-Type: application/json<br>Authorization: Bearer YOUR_JWT_TOKEN | { ?"org_name": str, "node_encodings": { "bindings": { ?"node": str, ?"node_color": str, ?"node_label": str, ?"node_opacity": str, ?"node_size": str, ?"node_title": str, ?"node_weight": str }, ?"complex": [ see complex section ] }, ?"node_files": [ str \| [ str, int ] ], "edge_encodings": { "bindings": { "source": str, "destination": str, ?"edge_color": str, ?"edge_label": str, ?"edge_opacity": str, ?"edge_size": str, ?"edge_title": str, ?"edge_weight": str }, ?"complex": [ see complex section ] }, ?"edge_hypergraph_transform": [{ ?"entity_types": [ str ], ?"direct": bool, ?"drop_edge_attrs": bool, ?"opts": { ?"EVENTID": str, ?"CATEGORIES": { str: [ str ] }, ?"DELIM": str, ?"SKIP": [ str ], ?"EDGES": { str: [ str ] } } }], ?"edge_files": [ str \| [ str, int ] ], "metadata": { ?[ see metadata section ] }, "name": str, ?"description": str } | { "data": {"dataset_id": str}, "message": str, "success": bool } |
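The following sketch assembles a request body with the required fields from the table and wraps it in a POST request, without sending it. The function names and the placeholder values ("my_dataset", "src", "dst", "file_id123") are assumptions for illustration; for simplicity the sketch always sends edge_files, even though the table marks it optional.

```python
import json
import urllib.request


def create_dataset_request(base_path, jwt_token, name, edge_files,
                           source, destination, node_files=None,
                           node_bindings=None, org_name=None,
                           description=None):
    """Build (but do not send) a POST request for api/v2/upload/datasets/."""
    body = {
        "node_encodings": {"bindings": dict(node_bindings or {})},
        "edge_encodings": {
            "bindings": {"source": source, "destination": destination}
        },
        "edge_files": edge_files,
        "metadata": {},
        "name": name,
    }
    # These fields are marked with "?" in the table; send them only when set.
    if node_files is not None:
        body["node_files"] = node_files
    if org_name is not None:
        body["org_name"] = org_name
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        base_path.rstrip("/") + "/api/v2/upload/datasets/",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )


def dataset_id_from_response(response):
    """Extract dataset_id from a decoded response, or raise on failure."""
    if not response["success"]:
        raise RuntimeError(response["message"])
    return response["data"]["dataset_id"]


req = create_dataset_request(
    "http://nginx", "YOUR_JWT_TOKEN", name="my_dataset",
    edge_files=["file_id123"], source="src", destination="dst",
)
```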
Delete dataset
Route | Method | Headers | Parameters | Return |
---|---|---|---|---|
api/v2/datasets/<dataset_id> | DELETE | Content-Type: application/json<br>Authorization: Bearer YOUR_JWT_TOKEN | | HTTP code 204 |
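A deletion can be sketched the same way. The helper names are hypothetical, DATASET_ID is a placeholder, and the request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request


def delete_dataset_request(base_path, jwt_token, dataset_id):
    """Build (but do not send) a DELETE for api/v2/datasets/<dataset_id>."""
    # Percent-encode the ID so it is safe as a single path segment.
    path = "/api/v2/datasets/" + urllib.parse.quote(dataset_id, safe="")
    return urllib.request.Request(
        base_path.rstrip("/") + path,
        method="DELETE",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )


def was_deleted(status_code):
    """The table above lists HTTP code 204 as the return for this route."""
    return status_code == 204


req = delete_dataset_request("http://nginx", "YOUR_JWT_TOKEN", "DATASET_ID")
```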
Hypergraphs
The hypergraph transform is enabled by passing the parameter edge_hypergraph_transform.
The hypergraph transform creates a node for every unique value in the entity_types columns (default: all columns). If direct=False (default), every row is also turned into a node.
The transform determines the source/destination edge columns based on direct: src/dst when direct=True, and EventID/attribID otherwise. By default, the node ID column should be nodeID, unless you provide your own node table.
Ex: Connect all columns through the row: row -> employee, boss, subsidary, year_hired, ...
{
  "node_encodings": {"bindings": {"node": "nodeID"}},
  "edge_encodings": {"bindings": {"source": "EventID", "destination": "attribID"}},
  "edge_files": ["FILE_123"],
  "metadata": {},
  "name": "hyper_123",
  "edge_hypergraph_transform": [{}]
}
Ex: Connect employee, boss, and subsidary together (no row nodes), merge employee/boss nodes that use the same name, control edge creation, and do not show nulls
{
  "node_encodings": {"bindings": {"node": "nodeID"}},
  "edge_encodings": {"bindings": {"source": "src", "destination": "dst"}},
  "edge_files": ["FILE_123"],
  "metadata": {},
  "name": "hyper_123",
  "edge_hypergraph_transform": [{
    "entity_types": [ "employee", "boss", "subsidary" ],
    "direct": true,
    "drop_na": true,
    "drop_edge_attrs": true,
    "opts": {
      "CATEGORIES": {
        "person": [ "employee", "boss" ]
      },
      "EDGES": {
        "person": [ "boss", "subsidary" ],
        "subsidary": [ "boss" ]
      }
    }
  }]
}
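The two examples above differ mainly in the direct flag and in the edge bindings that go with it. The sketch below captures that pairing in one hypothetical helper, hypergraph_dataset_body, which picks src/dst for direct=True and EventID/attribID otherwise, and keeps the default nodeID node binding. It only builds the JSON body; it is an illustration, not part of any Graphistry library.

```python
def hypergraph_dataset_body(name, edge_file_id, direct=False,
                            entity_types=None, opts=None):
    """Build a dataset body that enables the hypergraph transform."""
    transform = {}
    if direct:
        transform["direct"] = True
    if entity_types is not None:
        transform["entity_types"] = list(entity_types)
    if opts is not None:
        transform["opts"] = opts
    # Edge columns produced by the transform, as used in the examples above.
    if direct:
        edge_bindings = {"source": "src", "destination": "dst"}
    else:
        edge_bindings = {"source": "EventID", "destination": "attribID"}
    return {
        "node_encodings": {"bindings": {"node": "nodeID"}},
        "edge_encodings": {"bindings": edge_bindings},
        "edge_files": [edge_file_id],
        "metadata": {},
        "name": name,
        # An empty transform object still enables the transform.
        "edge_hypergraph_transform": [transform],
    }


default_body = hypergraph_dataset_body("hyper_123", "FILE_123")
direct_body = hypergraph_dataset_body(
    "hyper_123", "FILE_123", direct=True,
    entity_types=["employee", "boss", "subsidary"],
)
```

Calling the helper with no options reproduces the first example's body, while direct=True switches the bindings to match the second.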