Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update C API graph creation function signatures #3982

Merged
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
160 changes: 152 additions & 8 deletions cpp/include/cugraph_c/graph.h
Original file line number Diff line number Diff line change
Expand Up @@ -35,10 +35,11 @@ typedef struct {
bool_t is_multigraph;
} cugraph_graph_properties_t;

// FIXME: Add support for specifying isolated vertices
/**
* @brief Construct an SG graph
*
* @deprecated This API will be deleted, use cugraph_graph_create_sg instead
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] src Device array containing the source vertex ids.
Expand All @@ -55,7 +56,6 @@ typedef struct {
* integer values from 0 to num_vertices.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [in] properties Properties of the graph
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
Expand All @@ -76,9 +76,54 @@ cugraph_error_code_t cugraph_sg_graph_create(
cugraph_graph_t** graph,
cugraph_error_t** error);

/**
* @brief Construct an SG graph
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] vertices Optional device array containing a list of vertex ids
* (specify NULL if we should create vertex ids from the
* unique contents of @p src and @p dst)
Comment on lines +85 to +87
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does NULL vertices behave with and w/o renumbering? What if edge data only contain unique ids of e.g. 1, 3, and 5? Are 0, 2, and 4 implied if renumber is False?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If vertices is NULL, this will provide the same behavior as today. That is:

  • Renumber enabled will ignore vertices 0, 2, 4... they will not exist in the graph. We will create a graph with 3 vertices which will be internally numbered 0, 1, 2. Any function that is called will translate any vertex id inputs to the mapping 0, 1, 2 (e.g. if you call BFS and specify a seed of 1, it will be mapped to whichever id that vertex 1 was mapped to), and any vertex ids outputs will be renumbered from the internal mapping (0, 1, 2) to the external mapping (1, 3, 5)
  • Renumber disabled will use the vertex ids exactly as specified, so vertices 0, 2 and 4 will be isolated vertices in the graph. However, if your graph also had a vertex 6... that vertex would be lost even with renumber disabled, since there is no mechanism to discover that vertex 6 exists in your graph with the current (and proposed API).

* @param [in] src Device array containing the source vertex ids.
* @param [in] dst Device array containing the destination vertex ids
* @param [in] weights Device array containing the edge weights. Note that an unweighted
* graph can be created by passing weights == NULL.
* @param [in] edge_ids Device array containing the edge ids for each edge. Optional
argument that can be NULL if edge ids are not used.
* @param [in] edge_type_ids Device array containing the edge types for each edge. Optional
argument that can be NULL if edge types are not used.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] renumber If true, renumber vertices to make an efficient data structure.
* If false, do not renumber. Renumbering is required if the vertices are not sequential
* integer values from 0 to num_vertices.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this documentation is a bit misleading. In cuGraph, renumbering is more than just mapping integers to consecutive integers starting from 0. cuGraph renumbers vertex IDs in a specific way for various optimizations. This documentation may be interpreted that renumbering is unnecessary if vertex IDs are consecutive integers starting from 0. AFAIK, renumber = false's only remaining use case is for debugging.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated documentation for each of the renumber parameters in this file in the next push.

* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
*
* @return error code
*/
cugraph_error_code_t cugraph_graph_create_sg(
const cugraph_resource_handle_t* handle,
const cugraph_graph_properties_t* properties,
const cugraph_type_erased_device_array_view_t* vertices,
const cugraph_type_erased_device_array_view_t* src,
const cugraph_type_erased_device_array_view_t* dst,
const cugraph_type_erased_device_array_view_t* weights,
const cugraph_type_erased_device_array_view_t* edge_ids,
const cugraph_type_erased_device_array_view_t* edge_type_ids,
bool_t store_transposed,
bool_t renumber,
bool_t do_expensive_check,
cugraph_graph_t** graph,
cugraph_error_t** error);

/**
* @brief Construct an SG graph from a CSR input
*
* @deprecated This API will be deleted, use cugraph_graph_create_sg_from_csr instead
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] offsets Device array containing the CSR offsets array
Expand All @@ -95,7 +140,6 @@ cugraph_error_code_t cugraph_sg_graph_create(
* integer values from 0 to num_vertices.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [in] properties Properties of the graph
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
Expand All @@ -116,19 +160,66 @@ cugraph_error_code_t cugraph_sg_graph_create_from_csr(
cugraph_graph_t** graph,
cugraph_error_t** error);

/**
* @brief Construct an SG graph from a CSR input
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] offsets Device array containing the CSR offsets array
* @param [in] indices Device array containing the destination vertex ids
* @param [in] weights Device array containing the edge weights. Note that an unweighted
* graph can be created by passing weights == NULL.
* @param [in] edge_ids Device array containing the edge ids for each edge. Optional
argument that can be NULL if edge ids are not used.
* @param [in] edge_type_ids Device array containing the edge types for each edge. Optional
argument that can be NULL if edge types are not used.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] renumber If true, renumber vertices to make an efficient data structure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does renumber=True work for CSR? Wouldn't offsets already imply 0..N-1 vertex ids?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renumber=True would still work for CSR input. Yes, offsets does imply vertex ids in the range [0, N). But our renumbering logic adds some performance optimizations by putting the vertex ids in an order that makes the CUDA kernels more efficient. So there is value in specifying renumber=True even if the vertex ids are densely packed.

I will be fixing the C API implementation of this function so that it will properly reflect the isolated vertices that can be implied in CSR format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should update the documentation to better reflect that renumbering is more than just packing integers and highly recommended in most practical use cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated documentation for each of the renumber parameters in this file in the next push.

* If false, do not renumber. Renumbering is required if the vertices are not sequential
* integer values from 0 to num_vertices.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
*
* @return error code
*/
cugraph_error_code_t cugraph_graph_create_sg_from_csr(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since cugraph_graph_create_sg supports flags to drop both self loops and multi edges through the methods cugraph::remove_self_loops and cugraph::sort_and_remove_multi_edges, shouldn't cugraph_graph_create_sg_from_csr also support those flags? In fact, cugraph_graph_create_sg_from_csr extracts the COO from the CSR before calling cugraph::create_graph_from_edgelist therefore, we can drop self loops and multi edges (if set to True) right before the graph question.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add that feature. I was assuming that if you already have a CSR as input then you have already done whatever ETL steps are desired.

const cugraph_resource_handle_t* handle,
const cugraph_graph_properties_t* properties,
const cugraph_type_erased_device_array_view_t* offsets,
const cugraph_type_erased_device_array_view_t* indices,
const cugraph_type_erased_device_array_view_t* weights,
const cugraph_type_erased_device_array_view_t* edge_ids,
const cugraph_type_erased_device_array_view_t* edge_type_ids,
bool_t store_transposed,
bool_t renumber,
bool_t do_expensive_check,
cugraph_graph_t** graph,
cugraph_error_t** error);

/**
* @brief Destroy an graph
*
* @param [in] graph A pointer to the graph object to destroy
*/
void cugraph_graph_free(cugraph_graph_t* graph);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this function works for both SG&MG graphs, right? Then, should we place this function after SG & MG graph creation functions? I was searching for a new graph deletion function for MG (replacing the deprecated function) and it took sometime to find that this function works for both SG & MG.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved all of the free functions so that they are together in the file.


/**
* @brief Destroy an SG graph
*
* @deprecated This API will be deleted, use cugraph_graph_free instead
*
* @param [in] graph A pointer to the graph object to destroy
*/
// FIXME: This should probably just be cugraph_graph_free
// but didn't want to confuse with original cugraph_free_graph
void cugraph_sg_graph_free(cugraph_graph_t* graph);

// FIXME: Add support for specifying isolated vertices
/**
* @brief Construct an MG graph
*
* @deprecated This API will be deleted, use cugraph_graph_create_mg instead
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] src Device array containing the source vertex ids
Expand Down Expand Up @@ -165,13 +256,66 @@ cugraph_error_code_t cugraph_mg_graph_create(
cugraph_graph_t** graph,
cugraph_error_t** error);

/**
* @brief Construct an MG graph
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] vertices List of device arrays containing the unique vertex ids.
* If NULL we will construct this internally using the unique
* entries specified in src and dst
* All entries in this list will be concatenated on this GPU
* into a single array.
ChuckHastings marked this conversation as resolved.
Show resolved Hide resolved
* @param [in] src List of device array containing the source vertex ids
* All entries in this list will be concatenated on this GPU
* into a single array.
* @param [in] dst List of device array containing the destination vertex ids
* All entries in this list will be concatenated on this GPU
* into a single array.
* @param [in] weights List of device array containing the edge weights. Note that an
* unweighted graph can be created by passing weights == NULL. If a weighted graph is to be
* created, the weights device array should be created on each rank, but the pointer can be NULL and
* the size 0 if there are no inputs provided by this rank All entries in this list will be
* concatenated on this GPU into a single array.
* @param [in] edge_ids List of device array containing the edge ids for each edge. Optional
* argument that can be NULL if edge ids are not used.
* All entries in this list will be concatenated on this GPU
* into a single array.
* @param [in] edge_type_ids List of device array containing the edge types for each edge.
* Optional argument that can be NULL if edge types are not used. All entries in this list will be
* concatenated on this GPU into a single array.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] num_arrays The number of arrays specified in @p vertices, @p src, @p dst, @p
* weights, @p edge_ids and @p edge_type_ids
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_graph_create_mg(
cugraph_resource_handle_t const* handle,
cugraph_graph_properties_t const* properties,
cugraph_type_erased_device_array_view_t const* const* vertices,
cugraph_type_erased_device_array_view_t const* const* src,
cugraph_type_erased_device_array_view_t const* const* dst,
cugraph_type_erased_device_array_view_t const* const* weights,
cugraph_type_erased_device_array_view_t const* const* edge_ids,
cugraph_type_erased_device_array_view_t const* const* edge_type_ids,
bool_t store_transposed,
size_t num_arrays,
bool_t do_expensive_check,
cugraph_graph_t** graph,
cugraph_error_t** error);

/**
* @brief Destroy an MG graph
*
* @deprecated This API will be deleted, use cugraph_graph_free instead
*
* @param [in] graph A pointer to the graph object to destroy
*/
// FIXME: This should probably just be cugraph_graph_free
// but didn't want to confuse with original cugraph_free_graph
void cugraph_mg_graph_free(cugraph_graph_t* graph);

/**
Expand Down
12 changes: 12 additions & 0 deletions cpp/include/cugraph_c/resource_handle.h
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,18 @@ typedef struct cugraph_resource_handle_ {
*/
cugraph_resource_handle_t* cugraph_create_resource_handle(void* raft_handle);

/**
* @brief get comm_size from resource handle
*
* If the resource handle has been configured for multi-gpu, this will return
* the comm_size for this cluster. If the resource handle has not been configured for
* multi-gpu this will always return 1.
*
* @param [in] handle Handle for accessing resources
* @return comm_size
*/
int cugraph_resource_handle_get_comm_size(const cugraph_resource_handle_t* handle);

/**
* @brief get rank from resource handle
*
Expand Down
Loading
Loading